How to read a merged xlsx file in Python 3.x

I am having trouble reading an Excel file with merged cells using Python, and I hope somebody can help.
I want to read it into a database in the following format:
|Countries|Continent|North America|122|
|Countries|Continent|North America|123|
|Countries|Continent|North America|124|
|Countries|Continent|Africa|343|
|Countries|Continent|Africa|434|
|Countries|Continent|Africa|345|
|Countries|Continent|Asia|465|
...............
|Cars|Car Brands|Toyota|54|
|Cars|Car Brands|Toyota|35|
...............
|Food|Food Type|Legumes|34|
...............
Thanks in advance

Read by sheet name:
import pandas as pd
df = pd.read_excel('Example_workbook.xlsm', sheet_name='Sheet name')
Read by sheet index (0 is the first sheet):
xls = pd.ExcelFile('path_to_file.xls')
sheet1 = xls.parse(0)
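Reading the sheet is only half of what the question asks for. When cells are merged, readers typically return the value only in the first cell of the merged range and leave the rest empty, so the repeated labels in the desired output have to be filled back in. The following is a minimal sketch of that forward-fill step in plain Python. It assumes the empty cells arrive as None, and the sample rows and the forward_fill helper are made up for illustration. With a pandas DataFrame, where empty cells are NaN, the equivalent call would be df.ffill().

```python
def forward_fill(rows, columns):
    """Copy the last seen value down into empty cells of the given columns."""
    last_seen = {}
    filled = []
    for row in rows:
        new_row = list(row)
        for col in columns:
            if new_row[col] is None:
                # Empty cell: reuse the value from the row above.
                new_row[col] = last_seen.get(col)
            else:
                last_seen[col] = new_row[col]
        filled.append(new_row)
    return filled


# Hypothetical rows, shaped the way a reader might return merged cells.
rows = [
    ["Countries", "Continent", "North America", 122],
    [None, None, None, 123],
    [None, None, None, 124],
    [None, None, "Africa", 343],
    [None, None, None, 434],
    ["Cars", "Car Brands", "Toyota", 54],
    [None, None, None, 35],
]

filled = forward_fill(rows, columns=[0, 1, 2])
for row in filled:
    print("|" + "|".join(str(value) for value in row) + "|")
```

Each filled row can then be inserted into the database as-is. Only the label columns are filled, so a genuinely empty value in the last column would stay empty.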

Related

wrong output in converting date format data from an excel sheet to csv file in python

I have an Excel sheet that I am trying to convert to a CSV file. One of the columns in this sheet contains data in date format (like 7/4/2017). I wrote this code, but it does not convert the date field correctly:
import xlrd
import csv
def Excel2CSV(ExcelFile, SheetName, CSVFile):
    workbook = xlrd.open_workbook(ExcelFile)
    worksheet = workbook.sheet_by_name(SheetName)
    csvfile = open(CSVFile, 'w', encoding='utf8')
    wr = csv.writer(csvfile, delimiter=';')
    for rownum in range(worksheet.nrows):
        wr.writerow(worksheet.row_values(rownum))
    csvfile.close()
My sample data in excel is like this:
4/7/2017 value02 value03
4/5/2017 value12 value13
4/14/2017 value22 value23
4/10/2017 value32 value33
When I execute the code above, this is what I see in the output:
42832.0;value02;value03
42830.0;value12;value13
42839.0;value22;value23
42835.0;value32;value33
As you can see, the date field data is not converted correctly. What mistake am I making here?
Assuming you are using the xlrd package to read the file, you can find the answer at http://xlrd.readthedocs.io/en/latest/dates.html
Basically, dates are stored as 'number of days since ...' and are only formatted to appear as dates when viewed in Excel.
There are more details here:
http://xlrd.readthedocs.io/en/latest/api.html#module-xlrd.xldate
xldate_as_tuple is the function you want.
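To make the "number of days" idea concrete, here is a minimal sketch that converts the serial numbers from the question back into date strings using only the standard library. It assumes the workbook uses the 1900 date system (xlrd's datemode 0) and that the caller already knows which columns hold dates. The convert_row helper and its date_columns parameter are made up for illustration. With xlrd itself, xlrd.xldate_as_tuple(value, workbook.datemode) does the conversion and also handles the other date system.

```python
from datetime import datetime, timedelta

# Assumption: 1900 date system. For serials after February 1900,
# serial 0 then corresponds to 1899-12-30.
EXCEL_1900_EPOCH = datetime(1899, 12, 30)


def convert_row(row, date_columns=(0,), fmt="%m/%d/%Y"):
    """Return a copy of row with Excel serial dates rendered as text."""
    out = list(row)
    for col in date_columns:
        moment = EXCEL_1900_EPOCH + timedelta(days=out[col])
        out[col] = moment.strftime(fmt)
    return out


# The first output row from the question.
row = [42832.0, "value02", "value03"]
converted = convert_row(row)
print(converted)  # ['04/07/2017', 'value02', 'value03']
```

In the Excel2CSV function, the same conversion would be applied to each row before it is passed to writerow. Checking each cell's type against xlrd.XL_CELL_DATE avoids hard-coding the date columns.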

python2.7 xlsxwriter doesn't write more than 65533 rows

Hi, I am trying to create an xlsx file, but I can't write more than 65533 rows.
The library I am using is XlsxWriter.
How can I write more rows?
Simple test:
import xlsxwriter
workbook = xlsxwriter.Workbook("test.xlsx")
sheet = workbook.add_worksheet()
for i in range(1, 70000):
    sheet.write('B' + str(i), "Blub")
workbook.close()
After that I opened the file test.xlsx in LibreOffice, et voilà :)

Reading strikethroughs in Excel with Pandas

Is there a way for me to read strikethroughs in Pandas without resorting to VBA? I have a spreadsheet where a person refuses to use anything else. Thanks in advance!
This isn't possible in Pandas, but it is possible using xlrd. You could first parse the formatting and then turn the result into a Pandas dataframe.
Sample code to read a file using xlrd, adapted from the response to a similar question:
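import xlrd
# formatting_info=True is only supported for .xls files, not .xlsx.
wb = xlrd.open_workbook('tmp.xls', formatting_info=True)
sheet = wb.sheet_by_name("1")
cell = sheet.cell(6, 0)
# Look up the formatting record (XF) attached to this cell.
xf = wb.xf_list[cell.xf_index]
print("type(xf) is", type(xf))
print()
print("xf.dump():")
xf.dump()
# The font record referenced by the XF carries the strikethrough flag.
font = wb.font_list[xf.font_index]
print("struck out:", bool(font.struck_out))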

Question on python text file read write using Win32com

I am reading a text file, a.txt, which has many columns, and I need the data from 5 of them.
I have read the file and extracted the 5 columns, but my question is: how can I import the data from these 5 columns into an Excel file with the same columns as in the text file?
Here is an example showing how to generate an Excel file with the xlwt Python library.
Using win32com would be OK, but I personally prefer xlwt because it doesn't require Excel to be installed on the machine.
import xlwt
wb = xlwt.Workbook()
ws = wb.add_sheet("My sheet")
for line in range(10):
    for col in range(5):
        ws.write(line, col, line*10+col)
wb.save('myfile.xls')
I hope it helps

convert a tsv file to xls/xlsx using python

I want to convert a file in tsv format to xls/xlsx.
I tried using
os.rename("sample.tsv","sample.xlsx")
But the resulting file is corrupted. Is there any other way of doing it?
Here is a simple example of converting TSV to XLSX using XlsxWriter and the core csv module:
import csv
from xlsxwriter.workbook import Workbook
# Add some command-line logic to read the file names.
tsv_file = 'sample.tsv'
xlsx_file = 'sample.xlsx'
# Create an XlsxWriter workbook object and add a worksheet.
workbook = Workbook(xlsx_file)
worksheet = workbook.add_worksheet()
# Create a TSV file reader.
# In Python 3 the csv module needs a text-mode file opened with newline=''.
tsv_reader = csv.reader(open(tsv_file, 'r', newline=''), delimiter='\t')
# Read the row data from the TSV file and write it to the XLSX file.
for row, data in enumerate(tsv_reader):
    worksheet.write_row(row, 0, data)
# Close the XLSX file.
workbook.close()
You need to:
Read the data from the tsv file.
Convert the values to the types you want them to be.
Write them to an Excel file with openpyxl for xlsx or xlwt for xls.
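The first two steps can be sketched with the standard library alone. This is only an illustration under stated assumptions: the convert_cell and read_tsv helpers are made up, the rule "numeric-looking text becomes a number" is one possible conversion policy, and the in-memory sample stands in for sample.tsv.

```python
import csv
import io


def convert_cell(text):
    """Turn numeric-looking text into int or float; leave other text as is."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def read_tsv(stream):
    """Steps 1 and 2: read tab-separated rows and convert each cell."""
    reader = csv.reader(stream, delimiter="\t")
    return [[convert_cell(cell) for cell in row] for row in reader]


# Hypothetical sample data standing in for sample.tsv.
sample = io.StringIO("name\tcount\tprice\nbeans\t34\t1.5\n")
rows = read_tsv(sample)
print(rows)  # [['name', 'count', 'price'], ['beans', 34, 1.5]]
```

For the third step, each converted row can be handed to the writer of your choice, for example worksheet.append(row) in openpyxl or worksheet.write_row(row_index, 0, row) in XlsxWriter. Without the conversion step, the numbers would be stored in the workbook as text.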
