I might be missing something obvious here, but I can't find any info -- can anyone help me read a .xlsx file from S3 into Python? I know how to do it for CSV files, but not xlsx.
Best answer by TomGrundy
Yes I can help!
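The easiest way is to read the file into an io.BytesIO buffer and then pass that to pandas, i.e.
import io
import pandas as pd
buff = io.BytesIO(source.read())  # source: the file-like object you already have for the S3 file
df = pd.read_excel(buff)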
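If you don't already have a file-like source to read from, here is a minimal sketch of the same buffer approach using boto3. The helper names (fetch_s3_object, read_xlsx_from_s3) and the bucket/key values are made up for illustration, and it assumes your AWS credentials are already configured.

```python
import io


def fetch_s3_object(s3_client, bucket, key):
    """Download one S3 object fully into an in-memory buffer."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # response["Body"] is a streaming object; read() returns the raw bytes
    return io.BytesIO(response["Body"].read())


def read_xlsx_from_s3(bucket, key):
    """Hypothetical helper: fetch the object, then hand the buffer to pandas."""
    # Imported here so fetch_s3_object can be used without these installed
    import boto3
    import pandas as pd

    buff = fetch_s3_object(boto3.client("s3"), bucket, key)
    return pd.read_excel(buff)
```

You would then call something like read_xlsx_from_s3("my-bucket", "data/report.xlsx"), swapping in your own bucket and key.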
Apparently you can also pass the S3 path to pandas directly. If you're using a source, you can grab the bucket and key and turn them into an S3 path of the form s3://<bucket>/<key>.
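A rough sketch of the direct-path route, in case it helps. The function names and the example bucket/key are hypothetical, and as far as I know pandas needs the s3fs package installed to open s3:// paths.

```python
def s3_uri(bucket, key):
    """Build an s3:// path from a bucket name and an object key."""
    # Drop any leading slash on the key so the path has no double slash
    return f"s3://{bucket}/{key.lstrip('/')}"


def read_xlsx_from_s3_path(bucket, key):
    """Hypothetical helper: let pandas open the S3 path itself."""
    import pandas as pd  # assumption: s3fs is installed for s3:// support

    return pd.read_excel(s3_uri(bucket, key))
```

For example, s3_uri("my-bucket", "data/report.xlsx") gives "s3://my-bucket/data/report.xlsx", which is the string you would pass to pd.read_excel.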
FYI: pandas might moan and ask you to install an extra package to read the Excel file (openpyxl for .xlsx files).
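If you'd rather find out up front than wait for pandas to complain, something like this sketch checks whether the packages are importable before reading. The helper names are made up for illustration; engine="openpyxl" just makes the .xlsx reader explicit.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def read_xlsx(buff):
    """Hypothetical helper: fail early with an install hint, then read."""
    missing = missing_packages(["pandas", "openpyxl"])
    if missing:
        raise ImportError("Install first: pip install " + " ".join(missing))

    import pandas as pd

    return pd.read_excel(buff, engine="openpyxl")
```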