C
C#2y ago
hanu

❔ What is the recommended way to read excel(*.xlsx) files?

in .NET Framework 4.8. I don't need any info of fonts, colors, styles, or anything else; i only need the texts i get when i convert excel into CSV. thanks!
20 Replies
Esa
Esa2y ago
Is it possible for you to work with something else than .xlsx? I've had more problems with the format and ingestion than any other format.
hanu
hanu2y ago
what i've done is converting the .xlsx file into .csv and reading the csv files. i'm looking for some ways to automate the converting process
cumslvt13
cumslvt132y ago
GitHub
GitHub - EPPlusSoftware/EPPlus: EPPlus-Excel spreadsheets for .NET
EPPlus-Excel spreadsheets for .NET. Contribute to EPPlusSoftware/EPPlus development by creating an account on GitHub.
cumslvt13
cumslvt132y ago
There are export options, which includes ToText https://github.com/EPPlusSoftware/EPPlus/wiki/Export-data
GitHub
Export data
EPPlus-Excel spreadsheets for .NET. Contribute to EPPlusSoftware/EPPlus development by creating an account on GitHub.
cumslvt13
cumslvt132y ago
But mind that there are some license changes since v6, but for most of the tasks v5 will be enough
ineternet
ineternet2y ago
ive had success reading .docx files by opening them as zip files, you can try the same with .xlsx
MarkPflug
MarkPflug2y ago
@한우 I maintain a couple libraries that make .xlsx => .csv about as easy as it can get: https://github.com/MarkPflug/Sylvan.Data.Excel#converting-excel-data-to-csvs
GitHub
GitHub - MarkPflug/Sylvan.Data.Excel: The fastest .NET library for ...
The fastest .NET library for reading Excel data files. - GitHub - MarkPflug/Sylvan.Data.Excel: The fastest .NET library for reading Excel data files.
hanu
hanu2y ago
it looks really cool but i'm currently developing on .NET 4.8 thanks anyways
HowNiceOfYou
HowNiceOfYou2y ago
I'd happily provide code for that if you'd like. Do you have excel installed?
hanu
hanu2y ago
yup
HowNiceOfYou
HowNiceOfYou2y ago
Perfect, now this code is not generally recommend as it can be slow, but here's a example on how you do it using Excel
using Microsoft.Office.Interop.Excel;

class Program
{
static void Main(string[] args)
{
Application excel = new Application();
Workbook workbook = excel.Workbooks.Open(@"C:\myfile.xlsx");
Worksheet worksheet = workbook.Worksheets[1];

Range range = worksheet.UsedRange;
object[,] data = range.Value;

for (int i = 1; i <= range.Rows.Count; i++)
{
for (int j = 1; j <= range.Columns.Count; j++)
{
Console.Write(data[i, j] + " ");
}
Console.WriteLine();
}

workbook.Close();
excel.Quit();
}
}
using Microsoft.Office.Interop.Excel;

class Program
{
static void Main(string[] args)
{
Application excel = new Application();
Workbook workbook = excel.Workbooks.Open(@"C:\myfile.xlsx");
Worksheet worksheet = workbook.Worksheets[1];

Range range = worksheet.UsedRange;
object[,] data = range.Value;

for (int i = 1; i <= range.Rows.Count; i++)
{
for (int j = 1; j <= range.Columns.Count; j++)
{
Console.Write(data[i, j] + " ");
}
Console.WriteLine();
}

workbook.Close();
excel.Quit();
}
}
If you're looking for speed, I'd recommend Open XML SDK. They work on xlsx files. Here's an example.
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

class Program
{
static void Main(string[] args)
{
using (SpreadsheetDocument document = SpreadsheetDocument.Open(@"C:\myfile.xlsx", false))
{
WorkbookPart workbookPart = document.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

foreach (Row row in sheetData.Elements<Row>())
{
foreach (Cell cell in row.Elements<Cell>())
{
Console.Write(GetCellValue(workbookPart, cell) + " ");
}
Console.WriteLine();
}
}
}

static string GetCellValue(WorkbookPart workbookPart, Cell cell)
{
string cellValue = cell.CellValue.InnerXml;

if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
SharedStringItem sharedString = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(cellValue));
cellValue = sharedString.Text.Text;
}

return cellValue;
}
}
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

class Program
{
static void Main(string[] args)
{
using (SpreadsheetDocument document = SpreadsheetDocument.Open(@"C:\myfile.xlsx", false))
{
WorkbookPart workbookPart = document.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

foreach (Row row in sheetData.Elements<Row>())
{
foreach (Cell cell in row.Elements<Cell>())
{
Console.Write(GetCellValue(workbookPart, cell) + " ");
}
Console.WriteLine();
}
}
}

static string GetCellValue(WorkbookPart workbookPart, Cell cell)
{
string cellValue = cell.CellValue.InnerXml;

if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
SharedStringItem sharedString = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(cellValue));
cellValue = sharedString.Text.Text;
}

return cellValue;
}
}
Both of those methods work on 4.8, tried and tested.
hanu
hanu2y ago
wow, thanks!
HowNiceOfYou
HowNiceOfYou2y ago
Yeah man. Enjoy.
hanu
hanu2y ago
hmm one question. i'm gonna publish (i already did, actually) the program. then will user be have to have excel installed in their computers?
HowNiceOfYou
HowNiceOfYou2y ago
The top one yes, the open xml? No.
hanu
hanu2y ago
thanks!
HowNiceOfYou
HowNiceOfYou2y ago
Yep. Anytime.
MarkPflug
MarkPflug2y ago
Uh, it supports netstandard 2.0, which covers .NET 4.8. Did you try it?
hanu
hanu2y ago
oh i didn't notice that
Accord
Accord2y ago
Was this issue resolved? If so, run /close - otherwise I will mark this as stale and this post will be archived until there is new activity.