A month ago, students of the Big Data / Machine Learning Engineer (BDMLE) course reached a fork in the road: should they go deeper into Big Data (BD) or Machine Learning (ML)?
We conferred and decided to share a litmus test and an overview of each technology stack. We hope this makes your choice easier ahead of the new Tech Orda round.
Litmus test: if you enjoy programming more than deriving formulas and computing integrals, Big Data is the better fit for you. Otherwise, choose ML.
Stack
Big Data: HDFS, Hadoop, Hive, Spark, Kafka + Spark Structured Streaming, NoSQL (Cassandra), data layout (Parquet, ORC, compression), Hadoop 3+ features.
Machine Learning (basic): numpy, scipy, pandas, sklearn, pytorch, xgboost / lightgbm / catboost.
What would you choose?
🐳 Big Data
⚡️ Machine Learning
Write your answer in the comments
❤️ BD + ML: shaken, not stirred
photo: Artyom
#work #study