Add TPCDS benchmarking utility class#1

Open
shanky-259 wants to merge 5 commits into master from shashanks/add-tpcds-benchmark-util

@shanky-259

This PR adds a benchmarking utility class for running a standalone Spark application that benchmarks TPCDS data.

@shanky-259 shanky-259 requested a review from micaelal May 28, 2024 22:44
val conf = new SparkConf()
conf.set("spark.hadoop.hive.exec.scratchdir", "/tmp/hive-scratch")
// Add the following to set hive metastore uri
// ("spark.hadoop.hive.metastore.uris", "thrift://pdxa-axg-17-vm1.prod.twttr.net:31131")

Could you please remove our hostname?

conf.set("spark.hadoop.hive.exec.scratchdir", "/tmp/hive-scratch")
// Add the following to set hive metastore uri
// ("spark.hadoop.hive.metastore.uris", "thrift://pdxa-axg-17-vm1.prod.twttr.net:31131")
conf.set("spark.hadoop.hive.metastore.kerberos.principal", "hive/hive-metastore@TWITTER.BIZ")

And the principal too.
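
One way to address both comments is to keep the metastore URI and the Kerberos principal out of the source entirely and read them from the environment at startup. The following is a minimal sketch under stated assumptions, not the PR's implementation: the `HiveSettings` object and the environment variable names `HIVE_METASTORE_URIS` and `HIVE_METASTORE_PRINCIPAL` are hypothetical, while the two configuration keys are the ones that appear in the diff.

```scala
// Hypothetical helper: the object name and environment variable names are
// illustrative assumptions and do not come from the PR.
object HiveSettings {
  // Environment variable name -> Spark/Hive configuration key (keys as in the diff).
  private val envToKey: Map[String, String] = Map(
    "HIVE_METASTORE_URIS" -> "spark.hadoop.hive.metastore.uris",
    "HIVE_METASTORE_PRINCIPAL" -> "spark.hadoop.hive.metastore.kerberos.principal"
  )

  // Returns only the settings whose environment variable is present and non-empty,
  // so no site-specific value is ever hard-coded in the source.
  def fromEnv(env: Map[String, String] = sys.env): Map[String, String] =
    envToKey.flatMap { case (envName, key) =>
      env.get(envName).filter(_.nonEmpty).map(value => key -> value)
    }
}

// Possible use with the SparkConf from the diff (left as a comment so this
// block does not depend on Spark being on the classpath):
//   val conf = new SparkConf()
//   conf.set("spark.hadoop.hive.exec.scratchdir", "/tmp/hive-scratch")
//   conf.setAll(HiveSettings.fromEnv())
```

With this shape, the committed code carries only generic keys, and each deployment supplies its own hostname and principal.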

numPartitions: Int = 100)
numPartitions: Int = 100,
databaseName: String = "",
createHiveTableEnabled: Boolean = false)

Could you please also update the doc to show how to use this with the newly added options?
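
As a starting point for that doc update, the new parameters and their defaults could be illustrated roughly as follows. This is only a sketch: the excerpt does not show the enclosing class or method, so `BenchmarkOptions` is a hypothetical stand-in that reuses the three parameter names and defaults from the diff, and the database name `tpcds_bench` is made up.

```scala
// Hypothetical stand-in for whatever class or method takes these parameters;
// only the parameter names, types, and defaults are taken from the diff.
final case class BenchmarkOptions(
    numPartitions: Int = 100,
    databaseName: String = "",
    createHiveTableEnabled: Boolean = false)

object BenchmarkOptionsExample {
  // Omitting the new arguments yields the defaults shown in the diff.
  val defaults: BenchmarkOptions = BenchmarkOptions()

  // Named arguments let a caller set only the new options and keep
  // numPartitions at its default.
  val withHiveTables: BenchmarkOptions =
    BenchmarkOptions(databaseName = "tpcds_bench", createHiveTableEnabled = true)
}
```

Because both new parameters have defaults, existing call sites that pass only `numPartitions` would continue to compile unchanged.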

2 participants