Conversation
Is this pull request going to be merged? It looks useful, and I wondered if anyone has tested it. I'm new to Redshift and trying to build a process for writing both small and big tables into Redshift; I guess two approaches will be needed. I'm also having trouble with search_path settings: written tables always end up in the public schema, because the "." in the qualified name is ignored, so "test.test" gets written to the public schema as a table literally named test.test rather than as table "test" in the "test" schema. Any tips on this would be greatly appreciated!
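For what it's worth, the schema problem above comes from passing the qualified name as a single string: DBI quotes the whole string as one identifier. A minimal sketch of the difference, using DBI's `ANSI()` dummy connection so it runs without a database (the `con` and `my_df` names in the comments are hypothetical):

```r
library(DBI)

# A plain string is quoted as ONE identifier, so the "." is not treated
# as a schema separator and the table lands in the search_path default
# (usually "public"):
as.character(dbQuoteIdentifier(ANSI(), "test.test"))
#> "\"test.test\""

# DBI::Id() qualifies the schema explicitly:
as.character(dbQuoteIdentifier(ANSI(), Id(schema = "test", table = "test")))
#> "\"test\".\"test\""

# With a live connection (hypothetical `con`), the same Id object works
# with dbWriteTable:
# dbWriteTable(con, Id(schema = "test", table = "test"), my_df)
# or, alternatively, change the session search_path:
# dbExecute(con, "SET search_path TO test")
```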
Hi, thanks for the reminder. I think the biggest issue is that it assumes you're inserting textual data and writing all columns at once. It would be much better/more reliable to handle the specific columns being inserted and generate the string appropriately; currently, including the escape character would break the insert. As for inserting data, our experience has been that using …
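The escaping issue Paul describes can be sketched in a few lines: doubling single quotes is the standard SQL way to escape string literals, and taking an explicit column list avoids assuming every column is written. The function names here (`escape_sql_value`, `build_insert`) are illustrative, not part of this PR:

```r
# Escape a single value for use in a generated INSERT statement.
# Without the gsub, a value like "O'Brien" would break the query.
escape_sql_value <- function(x) {
  if (is.na(x)) return("NULL")
  if (is.numeric(x)) return(as.character(x))
  paste0("'", gsub("'", "''", as.character(x)), "'")
}

# Build a multi-row INSERT for an explicit subset of columns.
build_insert <- function(table, df, cols = names(df)) {
  rows <- apply(df[, cols, drop = FALSE], 1, function(r)
    paste0("(", paste(vapply(r, escape_sql_value, character(1)),
                      collapse = ", "), ")"))
  paste0("INSERT INTO ", table, " (", paste(cols, collapse = ", "),
         ") VALUES ", paste(rows, collapse = ", "))
}

build_insert("lookup", data.frame(id = "1", name = "O'Brien"))
#> "INSERT INTO lookup (id, name) VALUES ('1', 'O''Brien')"
```

This is still only sensible for the small lookup-table case discussed below; for anything large, row-by-row (or even multi-row) INSERTs are slow on Redshift.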
Agree with Paul here. Bulk uploads are very inefficient using the above. I wrote this primarily so I wouldn't need a separate piece of code/script to upload small lookup files to Redshift, each no more than a few MB (single digits!).
Thanks for the info! I have now heard that the AWS best practice is:

R data frame ---> S3 ---> Redshift (via a COPY command)

I see this function from the redshiftTools package to automate the process:

```r
redshiftTools::rs_replace_table(my_data, dbcon = con,
                                tableName = 'mytable', bucket = "mybucket")
```

(source: https://www.r-bloggers.com/how-to-bulk-upload-your-data-from-r-into-redshift/)

Can either of you recommend this or another function to perform the data-frame-to-Redshift transfer for large tables? I'm new to Redshift in a new position and trying to figure out a production-quality pipeline for pushing transformed data out of an R data frame into Redshift.

Thank you very much for any tips!
Derek
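For reference, the dataframe -> S3 -> COPY route that redshiftTools automates can also be done by hand. The sketch below builds the Redshift `COPY` statement (CSV, gzipped, header skipped, authenticated via an IAM role); the bucket name, IAM role ARN, and `copy_stmt` helper are all assumptions for illustration, and the commented steps need a live AWS/Redshift setup:

```r
# Build a Redshift COPY statement loading a gzipped CSV from S3.
copy_stmt <- function(table, s3_path, iam_role) {
  paste0("COPY ", table, " FROM '", s3_path, "'\n",
         "IAM_ROLE '", iam_role, "'\n",
         "FORMAT AS CSV GZIP IGNOREHEADER 1")
}

# Full flow (requires real credentials; hypothetical names throughout):
# write.csv(my_data, gzfile("my_data.csv.gz"), row.names = FALSE)
# aws.s3::put_object("my_data.csv.gz", object = "my_data.csv.gz",
#                    bucket = "mybucket")
# DBI::dbExecute(con, copy_stmt("mytable", "s3://mybucket/my_data.csv.gz",
#                               "arn:aws:iam::123456789012:role/redshift-load"))

cat(copy_stmt("mytable", "s3://mybucket/my_data.csv.gz",
              "arn:aws:iam::123456789012:role/redshift-load"))
```

COPY loads the file in parallel across the cluster's slices, which is why it scales so much better than generated INSERT statements for large tables.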
On Fri, Jan 20, 2017 at 10:51 AM, Eeshan Chatterjee wrote:
Added a function to insert an R data.frame into a Redshift table; the INSERT query is generated automatically.