Industrial Engineering Grad Set to Use Data Analysis on Real-World Issues
There’s a saying about how the journey is more important than the destination. For Varun Senthil, MS ’20, industrial engineering, the destination may not yet be finalized, so it’s difficult to compare it to his journey. But his journey, which included an impressive co-op at Tennessee-based Oak Ridge National Laboratory, is far from over.
While at Oak Ridge, Senthil co-wrote a soon-to-be-published paper, “Benchmarking Cassandra vs Timescale DB for High Volume Data Application.” He also extended his four-month contract to eight months, in a role where he worked on system architecture, Bash scripting and SQL, and on streamlining a process pipeline with Python.
Senthil admitted that his trek to becoming a fully functional database administrator and data analyst was a bit of a surprise—to himself at least. He started as a manufacturing engineer, receiving his undergrad degree from the Vellore Institute of Technology in India.
As a production engineer, Senthil wanted to expand his process optimization capabilities beyond a single manufacturing plant, which led him to pursue and receive a graduate certificate in data analytics engineering.
Senthil was quick to credit his mentors, Harold Shanafield and Mitchell Broxon, for their role in the paper. The duo motivated him to standardize the data application process and publish the research, as few studies have yet been carried out on Timescale DB.
“My group, the Atmospheric Radiation Measurement Data Center (ARM Data Center), was looking for alternates to Cassandra Database,” Senthil said. “Timescale DB looked like a promising alternative as it was new and a lot of companies have started adopting it. It is a time-series database optimized for fast data ingest but also provides full querying capabilities.”
It essentially merges the best of both relational and non-relational databases, he explained. His group then explored how the database could be applied to the lab’s data demands, and whether shifting the infrastructure away from Cassandra was worth it.
“I used a time scale benchmark suite, which essentially calculates the bulk load and query execution performance of both these databases,” Senthil said. “We ran a simple aggregate query with varied time intervals, and with single and multiple hosts to check the execution time.”
Timescale DB is a “beautiful platform,” he added, because it combines the full querying capabilities of a relational database with the write speed of a non-relational database.
“It is the best option for a database where data is being inserted and retrieved on the scale of over thousands of rows per second,” Senthil explained.
Using data to solve global problems
While Senthil is not sure where his final destination lies, he knows what he wants to accomplish on the way there.
“If I am to find meaning at work, it would be in a space where I help people who are trying to mitigate the ripples caused by overpopulation, with climate change being one major effect,” he said.
Food security, reduced dependence on gasoline, and urban planning are also key issues that need greater attention. There should also be a focus on decentralizing cities that are close to the ocean, as water levels continue to rise each year.
Many startups across a range of industries are trying to solve those issues, he said.
“I want to find myself in some of those spaces, using my data skills to optimize their processes,” Senthil said. “I could optimize a company’s operational efficiency so that it can better use its resources and better target the problem it is tackling.”