Karun Thankachan is a Senior Data Scientist who specializes in Recommender Systems and Information Retrieval. Over the years, he has worked across several industries, including e-commerce, fintech, PXT, and edtech. Currently, he is part of the Walmart e-Commerce team, where he helps improve item selection and availability using machine learning. Outside of work, Karun is active in the data science community as a mentor, speaker, and content creator.
In this exclusive interview, Karun shares insights from his journey: how he broke into data science, lessons from scaling systems at top companies, and his advice for those navigating the world of data science.

Karun’s Notable Achievements
- Published researcher with multiple papers in machine learning
- Holds 2 patents in the field of artificial intelligence
- Editorial board member for IJDKP and JDS
- Data Science mentor on Topmate
- Named a Top 50 Topmate Creator in North America (2024)
- Recognized as a Top 10 Data Mentor in the USA (2025)
- Perplexity Enterprise Fellow
- Followed by 70,000+ professionals on LinkedIn
- Co-founder of BuildML, a community for research paper discussions and project-based learning
Check out his LinkedIn profile here.
Q1. Can you share your journey into data science and what led you to your current position as a Senior Data Scientist at Walmart?
I started my career, like many others in India, as a Software Development Engineer at Dell Technologies. During that time, I met the Director of Software Engineering, who wanted to launch an analytics wing for our engineering projects. I was fortunate to be one of the first people selected to work on it.
That led me to data analytics projects, building Hadoop clusters, and doing large-scale predictive analytics. Over time, this naturally evolved into machine learning work. After two years in the role, I felt I was hitting a plateau and decided to pursue a master’s in Data Science.
I prepared for the GRE, and with a score of 332/340, a solid GPA, and a patent to my name, I felt confident about getting into a top U.S. university.
But in 2018, I was rejected by every university I applied to. It was a humbling experience.
I took a step back and reassessed my application. I reached out to alumni, graduate students, and professors, many through LinkedIn, for honest feedback. With their help, I improved my profile and was eventually admitted to my dream program: a Master’s in Computational Data Science at Carnegie Mellon University. The program was rigorous but highly rewarding. It helped me land my first Data Science role at Amazon.
The pace at Amazon was intense and pushed me to grow quickly. The experience I gained there led me to my current role as a Senior Data Scientist at Walmart e-commerce, where I lead the workstream focused on improving item availability.
Q2. From your experience, what’s your advice for data science professionals to increase callbacks and stand out in job applications?
Getting a callback does depend a bit on timing and luck. But there are still three important things you can control to increase your chances:
- Choose relevant projects
- Build a strong resume and LinkedIn profile
- Develop a network that can refer you (more on this later)
Data Science Projects
When it comes to projects for Data Science roles, your goal is to show a few key skills:
- The ability to extract insights and engineer features. This includes tasks like cleaning data (handling outliers, missing values, imbalance, encoding), recognizing data patterns (like skewed distributions or dependencies), and creating useful features for modeling
- Experience with model fitting and fine-tuning. You should be able to translate business problems into machine learning problems, select the right metrics and models, and fine-tune those models for better performance
- Skill in analyzing model errors and improving version one. This means identifying where your model fails, why it fails, and deciding whether to fix it through data improvements or better techniques
- Comfort with building production-ready pipelines. This includes designing machine learning pipelines that can run at scale. You should understand:
  - Cloud platforms like AWS, GCP, or Azure (especially compute, storage, and model serving)
  - Orchestration tools like Airflow or SageMaker Pipelines
  - Containerization with tools like Docker
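As a rough illustration of the cleaning steps in the first point, the sketch below imputes missing values with the median, clips outliers using the interquartile range, and one-hot encodes a categorical column. It uses only the Python standard library; the function names and sample values are hypothetical, and a real project would more likely rely on pandas or scikit-learn.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]


def one_hot(categories):
    """Encode labels as one-hot rows, with columns in sorted label order."""
    labels = sorted(set(categories))
    return labels, [[int(c == label) for label in labels] for c in categories]


# Hypothetical order values with one missing entry and one extreme value.
order_values = [20.0, 22.0, None, 25.0, 21.0, 400.0, 23.0]
filled = impute_median(order_values)
cleaned = clip_outliers_iqr(filled)

# Hypothetical sales channel column.
labels, encoded = one_hot(["web", "app", "web", "store"])
```

Clipping rather than dropping keeps the row count unchanged, which is one possible choice; whether to clip, drop, or keep extreme values depends on the business problem.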
Once your portfolio looks strong, the next step is to sell it well. That starts with writing a solid resume. Focus on three things: structure, content, and tailoring.
Resume Building Tips
For structure, your resume should have these main sections:
- Career summary
- Experience and projects
- Education
- Skills
It’s better to keep skills at the end. They matter more for an applicant tracking system than for a human reviewer.
For content, make sure each bullet point can stand alone. Don’t expect the reader to connect it with earlier bullets. Each point should follow the PSI format (problem, solution, and impact):
- What was the problem you were solving? Be sure to mention the business domain
- What was your technical solution? Be specific. Use model names (like XGBoost, VGG, Llama) and techniques (like segmentation analysis or root cause analysis)
- What was the impact? Use numbers. If it was a business project, show business metrics. If it was a personal or academic project, compare your model’s results to a baseline
In the skills section, consider adding a subsection called “Competencies.” This is where you list keywords relevant to your role, like Python, SQL, A/B testing, segmentation analysis, and so on.
Finally, let’s talk about tailoring your resume. You don’t need to rewrite all your bullet points for each role. That takes too much time, especially when you’re applying to many roles.
Instead, create themed resumes.
Make a few versions of your resume based on the business areas you’ve worked in. For example, if your experience is in both marketing and supply chain, you can create three versions: one focused on marketing, one on supply chain, and one that combines both.
Q3. How can aspiring data scientists effectively build their networks to get more referrals for jobs?
Here are three things I recommend for growing your network and getting referrals:
Connect on LinkedIn
You can send up to 100 connection requests per week. Try to use all of them. Keep your message short and relevant. For example:
Hi, I would like to request a referral for Job ID: ABC. Given my work on <briefly describe a related project using the PSI format>, I believe I’m a good fit for the role.
If your profile is strong and the message is personalized, expect 4–6 responses out of every 100. When someone accepts your request, follow up with a polite message to start a conversation.
Cold Emails
This approach is also a numbers game. Reach out to people working at your target companies. Make your emails brief, clear, and respectful. For a good example, check out Leon Jose’s post on this topic.
In-Person Networking
Attend conferences, meetups, and hackathons. Use platforms like Analytics Vidhya, Meetup, Eventbrite, and LinkedIn Events to find both in-person and virtual events. Larger cities often host startup events and industry-specific conferences. These are great places to meet recruiters and other professionals. Competitions and hackathons are also useful for showing your skills and building meaningful connections.
Check out Angelica Spratley’s post for more on finding communities and in-person events.
Q4. What kind of projects do you recommend for someone aspiring to become a data scientist?
For Data Science roles, your project should demonstrate these key capabilities:
- Feature Engineering & Data Insights: Show that you can clean data (handle outliers, missing values, imbalance, encoding), understand data nuances (skewed distributions, dependencies), and create predictive features
- Model Development & Tuning: Convert business problems into ML problems, select appropriate models and metrics, and fine-tune models effectively
- Error Analysis & Iteration: Analyze where the model is falling short, and decide whether to improve performance through better techniques/models or by revisiting the data
- Production-Ready Pipelines: Highlight your ability to design scalable training and inference pipelines using:
  - Cloud Platforms: AWS, GCP, or Azure (focus on compute, storage, and serving)
  - Orchestration Tools: Airflow, SageMaker Pipelines
  - Containerization: Docker
You can use a guided project to get started. Here are a few: Forecasting Sales, Recommendation System, XGBoost Based Prediction.
Data Science Projects
Guided projects can help you understand how to go about creating projects. However, you will need to dive into data on your own as well. Here are a few projects you can look into to demonstrate the above:
Instacart Market Basket Analysis and Next-Item Prediction
Predicting what the user will purchase next is an evergreen business problem. The top-placed solution for this competition is also publicly available, which gives you the opportunity to reproduce it, analyze its shortcomings, and work on improvements.
Check out this project here.
Walmart Sales Forecasting
Ample opportunity to showcase the ability to clean data (outliers), fit and tune models (you can experiment with anything from statistical models like ARIMA to DNN models like LSTM), and improve on v1 models by adding external data (sales data, SNAP days, weather, etc.). Also, sales forecasting is a very well-understood area and makes for good conversation during interviews!
This is also a good project for building out batch prediction pipelines and hosting the results on a Tableau dashboard. A merchandiser could use those insights to decide their upcoming assortment, or a marketing team could use them to decide which deals to push.
Check out the project here.
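Before fitting ARIMA or an LSTM, it can help to have a simple baseline to compare against. The sketch below is one possible baseline, a seasonal-naive forecast scored with mean absolute error. The weekly figures and the 4-week season are made up for illustration and are not taken from the Walmart dataset.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast each future step with the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical weekly sales with a 4-week cycle; the last 4 weeks are held out.
weekly_sales = [100, 120, 90, 150, 104, 118, 95, 155, 101, 125, 92, 149]
train, test = weekly_sales[:-4], weekly_sales[-4:]

baseline = seasonal_naive_forecast(train, season_length=4, horizon=4)
baseline_mae = mean_absolute_error(test, baseline)
```

Any more complex model can then be reported as an improvement over `baseline_mae`, which gives a resume bullet or interview answer a concrete number to point to.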
Payment Fraud Detection
FinTech is probably the one domain that consistently hires data folks, and fraud detection remains one of the most common use cases. The dataset is real-world e-commerce data, and the discussion board is full of pointers on feature engineering.
Check out this project here.
Quora Insincere Question Identification
A project to showcase your NLP knowledge, including text cleaning, handling embeddings, and extracting semantic meaning. Unlike typical NLP projects, this project provides ample room to analyze errors, dive deep into peculiarities of the English language, make hypotheses on how to account for those peculiarities, and improve a v1 model. Makes for great conversation during interviews!
Click here to explore the project.
H&M Fashion Recommendations
Great project to stand out in the RecSys space. Ample opportunity to learn techniques ranging from basic (content/collaborative filtering) to advanced models (two-tower, WDNs, etc.). In addition, the datasets include images, allowing you to demonstrate the ability to handle multi-modal data.
This is also a good project for creating an inference pipeline: train a model on the data you have, let a customer with a particular customer ID hit the model API endpoint, and serve that customer a “landing page”, a set of items personalized to them. You could even build out multiple carousels like:
- Customers who bought this also bought (“Cross-Selling”)
- Styles you might like (based on their preferences)
Click here to check out more details on this project.
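A “customers who bought this also bought” carousel can be prototyped with plain co-purchase counts before moving to learned models. The sketch below is a minimal illustration of that idea, not the approach used in the competition; the baskets and item names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_purchase_counts(baskets):
    """Count how often each ordered pair of distinct items shares a basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        for item, other in permutations(set(basket), 2):
            counts[item][other] += 1
    return counts


def also_bought(counts, item, top_k=3):
    """Return up to top_k items most often bought with `item` (ties by name)."""
    ranked = sorted(counts[item].items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:top_k]]


# Hypothetical purchase baskets.
baskets = [
    ["jeans", "belt", "t-shirt"],
    ["jeans", "belt"],
    ["jeans", "sneakers"],
    ["t-shirt", "sneakers"],
]
counts = build_co_purchase_counts(baskets)
carousel = also_bought(counts, "jeans")
```

Raw counts favor popular items, so a natural next iteration is to normalize by item frequency or to swap in collaborative filtering or a two-tower model behind the same `also_bought` interface.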
Q5. What are your tips for preparing effectively for data science interviews?
These are the 7 areas you need to prepare for DS/ML interviews. Each company uses a different combination of these areas.
Coding
Probability and Statistics
SQL
- If you are completely new to SQL, start with SQL 50
- If you know your way around SQL, check out DataLemur SQL Interview Questions
Machine Learning
Understand the basics, which include:
- Feature Engineering and Selection: Understand missing value imputation, normalization/scaling, and a few feature selection techniques.
- Bias & Variance: Overfitting/underfitting. Understand how to decide between models based on theory. Know the different regularization techniques and the impact of each.
- Loss Functions: Yes, you need to know the formulae of MSE, MAE, Log-Loss, etc.
- Linear Regression, Logistic Regression, Tree models, k-means: What are the model assumptions and how do you decide when to apply what?
- The best learning resource, in my opinion, is Introduction to Statistical Learning
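For reference, the three loss functions named above can be written out directly. This is a minimal standard-library sketch of the textbook definitions (with binary log-loss and probability clipping as an assumed convention), not a replacement for library implementations.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)
```

MSE penalizes large errors more heavily than MAE because residuals are squared, which is a common interview follow-up when discussing sensitivity to outliers.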
Deep Learning
- Understand the basics such as optimizers, loss functions, and basic architectures (MLP, CNN, RNN).
- The best learning resource, in my opinion, is Deep Learning by Ian Goodfellow
Case Studies
A case study round can be quite broad, e.g. “Assume you are a Data Scientist at Etsy. You want to improve the add-to-cart rate. How would you go about it?”
The best way to approach such a question is to have a framework:
- Clarify and narrow the problem
- Define key business metrics
- Identify appropriate ML formulation
- Align model metrics with business goals
- Suggest initial models
- Explain productionization strategy
- Outline A/B testing plan
For practice, check out this video from Emma Ding and this playlist.
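For the A/B testing step, a common way to check whether a change in a rate such as add-to-cart is statistically meaningful is a two-proportion z-test. The sketch below is one such check under the usual large-sample assumptions; the session and conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: add-to-cart events out of sessions, control vs. treatment.
z, p_value = two_proportion_z_test(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
```

In an interview answer, the test itself is only part of the plan: the randomization unit, the sample size needed, and guardrail metrics are usually worth stating as well.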
Behavioural
- Learn to tell compelling stories and demonstrate impact
- Start with Levels.fyi for interview basics
- For tough culture-fit prep, study Amazon’s behavioral expectations
Q6. Do data scientists need a strong understanding of data structures and algorithms? What’s your take on its importance?
For cracking ‘Applied Data Scientist’ roles, yes, since the interview will have a DSA round. If you want to know the difference between the various roles, check this article.
For your day-to-day, DSA doesn’t play a heavy role.
However, I do think most people who code should be able to solve LeetCode Medium level questions. In my personal experience, folks who understand and can apply DSA patterns (dynamic programming, two-pointer, sliding window, etc.) can better understand advanced coding patterns (factory, dependency injection) and in general produce better quality production code (i.e. more readable, maintainable, etc.).
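As a small illustration of one of the DSA patterns mentioned above, here is a sliding-window sketch that finds the largest sum of any `k` consecutive values in a single pass. The function name and the example problem are chosen for illustration only.

```python
def max_window_sum(values, k):
    """Return the largest sum of any k consecutive values in O(n) time."""
    if k <= 0 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    window = sum(values[:k])
    best = window
    for i in range(k, len(values)):
        # Slide the window: add the entering value, drop the leaving one.
        window += values[i] - values[i - k]
        best = max(best, window)
    return best
```

The same add-one, drop-one idea shows up in day-to-day data work, for example when computing rolling aggregates over a time series.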
Q7. What’s your current GenAI tech stack, and how do you leverage it to scale and enhance your work?
My stack at work is quite simple: GitHub Copilot for coding/debugging and ChatGPT for research.
Copilot helps me spend less time switching between documentation. It’s also quite good at helping understand legacy code, especially breaking down lengthy SQL statements. Its debugging feature also helps reduce the amount of time I spend on StackOverflow.
ChatGPT has been a game-changer. A significant amount of time is usually spent on deciding what models to experiment with for a particular ML problem. It helps provide a good starting list of techniques to try out, and often I do find this list to be quite comprehensive. This has reduced the amount of time I usually spend researching from days to hours.
Q8. What emerging trends in data science and GenAI excite you the most, and how do you see them influencing the industry in the next few years?
The first is the use of world knowledge embedded in LLMs to enhance solutions. This is especially true in recommender systems, where LLMs have been shown to help combat cold start better and produce more informed embeddings.
The next trend is reducing the gap between what users want and how users use the system, especially for complex tasks. For instance, if you want to plan a party and need to buy things for it, what would you do?
You would think of everything you needed to buy and manually search for the items one by one on Walmart or Amazon. But now? You can directly state your need, ‘party essentials’, on these websites and they will understand your intent and provide suggestions for you.
This new opportunity of serving ‘broad intent’ is interesting.
The other trend I’m actively involved in is ‘Multi-agent Systems’: LLM-powered agents that have been fine-tuned on a particular task, have reasoning ability, and interact with other agents to help users solve complex tasks. For example, if you are planning a trip you can have an agent that takes care of deciding what to do, another that decides where to stay, another that chooses between options based on budget or safety, etc.
This space is rapidly advancing and I am excited to see what new innovations occur here.
Q9. Can you describe a situation where you had to make critical data-driven decisions with incomplete or ambiguous data? How did you navigate it?
Lack of sufficient data or dealing with poor quality data is a common challenge in production environments. Before addressing these issues, it’s crucial to adopt the mindset of understanding your dataset deeply:
- Where does it come from?
- How often is it updated?
- Who maintains it?
Once you have clarity on these aspects and the dataset meets your basic requirements, you can start tackling quality issues.
Improving data quality typically involves close collaboration with multiple tech and business teams. Be patient, ask questions, and aim for steady progress. If progress stalls, escalate appropriately. If that still doesn’t help, it’s worth reassessing whether the effort is justified.
Ambiguity in decision-making is also common, especially when deploying new models to production that may impact downstream systems. In such cases, focus on what can be measured and ensure any observable impact is either positive or, at minimum, not negative. Roll out changes in phases to contain potential negative effects. Over time, with multiple iterations, you’ll develop stronger instincts and frameworks for navigating these decisions.
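One common way to implement a phased rollout is to hash each user ID into a stable bucket and expose the new model only to buckets below the current rollout percentage. The sketch below illustrates that idea; the salt value, function names, and percentages are hypothetical and do not describe any specific company's setup.

```python
import hashlib


def rollout_bucket(user_id, salt="model-v2"):
    """Map a user ID to a stable bucket in [0, 100) via a hash."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def use_new_model(user_id, rollout_percent, salt="model-v2"):
    """Return True if this user falls inside the current rollout percentage."""
    return rollout_bucket(user_id, salt) < rollout_percent


# Hypothetical phases: widen exposure only while measured impact stays non-negative.
phases = [1, 5, 25, 100]
exposed_at_each_phase = [use_new_model("customer-123", pct) for pct in phases]
```

Because the bucket depends only on the salt and the user ID, a user who sees the new model at 5% continues to see it at 25%, which keeps the experience consistent as the rollout widens.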
End Note
Karun Thankachan’s journey is a powerful blueprint for anyone looking to break into or grow within the field of data science. His story blends persistence, continuous learning, and strategic career moves, qualities that are essential in today’s competitive landscape. From navigating rejections to making critical decisions with ambiguous data, Karun’s insights offer both inspiration and practical advice.
For aspiring and early-career data scientists, this interview highlights the importance of building strong technical foundations, crafting relevant projects, networking intentionally, and preparing holistically for interviews. For professionals looking to scale, Karun’s experiences at top tech companies provide a valuable lens into how to think about impact, collaboration, and long-term growth.
If you found his perspectives helpful and want to connect or learn more from him, feel free to reach out to Karun via LinkedIn.