
Episode 31: From Quantum Computing to Epidemic Modeling with Colleen Farrelly

· Datacast

Show Notes:

(1:58) Colleen gave a brief overview of her professional background and her path to data science. (3:27) Colleen explained how her background in medicine and social science contributes to her success as a data scientist working in different domains. (5:20) Colleen shared her thoughts on how data science varies by sector. (7:02) Referring to her consulting company Staticlysm LLC, Colleen discussed a current medical technology project that she is working on. (8:19) Colleen shared applications of quantum machine learning in the wild, referring to her work at Quantopo - where she is the co-founder and chief mathematician. (12:33) Colleen discussed a new project leveraging quantum and quantum-inspired algorithms for nuclear reactor optimization at Quantopo. (15:17) Colleen gave advice for data scientists who want to start a business and get into consulting. (16:58) Colleen discussed topological data analysis in machine learning. (23:02) Colleen discussed epidemic modeling and engaging foreign aid organizations, given her experience with the Ebola outbreak. (26:01) Colleen discussed the different approaches to modeling the spread of diseases in epidemics, which follow a differential equation framework. (30:22) Colleen explained why buy-in and cooperation from those in power are critical to combating epidemics. (32:20) Colleen explained the differences between the current Coronavirus pandemic and the previous Ebola epidemic (note that this conversation was recorded in mid-March 2020, so information about COVID-19 may be dated). (37:06) Colleen shared resources for up-skilling in data science. (39:40) Colleen talked about the benefits of writing on Quora, where she has written more than 13,000 answers. (41:31) Colleen shared the traits of an excellent technical communicator and/or data translator. (43:26) Colleen shared her process of writing a technical book that focuses on the use cases of topology, geometry, and graph theory in machine learning and data science.
(50:32) Colleen talked about the growth of the data science community in Miami. (54:30) Colleen discussed her involvement in data science within the African sectors. (57:55) Colleen shared her thoughts on how the data science field will evolve in the next few years: the access to big data platforms, the dominance of Python, and the rise of quantum computing. (01:01:12) Closing segment.
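The differential-equation framework for epidemic spread that Colleen mentions can be illustrated with a minimal SIR (Susceptible-Infected-Recovered) compartmental model. This is a sketch only; the population size, transmission rate (beta), recovery rate (gamma), and horizon below are hypothetical values chosen for illustration, not figures from the episode.

```python
# Minimal SIR compartmental model, integrated with simple Euler steps.
# All parameter values are illustrative, not taken from the episode.

def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Return a list of (S, I, R) tuples, one sample per day."""
    n = s0 + i0 + r0                  # total population, assumed constant
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = int(round(1 / dt))
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt   # flow S -> I
            new_recoveries = gamma * i * dt          # flow I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory

# Hypothetical outbreak: 10 seed infections in a population of 10,000,
# basic reproduction number R0 = beta / gamma = 3.
traj = simulate_sir(s0=9990, i0=10, r0=0, beta=0.3, gamma=0.1, days=160)
peak_infected = max(i for _, i, _ in traj)
```

Because the three flows only move people between compartments, the total S + I + R stays constant, which is a useful sanity check on any implementation of this family of models.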

Her Contact Info:


Her Recommended Resources:

IBM Quantum Computing
Xanadu (Quantum Computing Hardware)
D-Wave Systems (Quantum Computing Hardware)
Ayasdi (Topological Data Analysis-Focused Startup)
The SIR Model for Spread of Disease
Google Scholar
arXiv
LinkedIn Learning
Coursera
“Genetic Algorithms and Adaptation” paper by John Holland
“Random Forests” paper by Leo Breiman
Andrew Ng (Founder of Coursera, DeepLearning.AI, and Landing.AI)
Dover Series on Mathematics


  • Show Notes (02:06) Azin described her childhood growing up in Iran and going to a girls-only high school in Tehran designed specifically for extraordinary talents. (05:08) Azin went over her undergraduate experience studying Computer Science at the University of Tehran. (10:41) Azin shared her academic experience getting a Computer Science MS degree at the University of Toronto, supervised by Babak Taati and David Fleet. (14:07) Azin talked about her teaching assistant experience for a variety of CS courses at Toronto. (15:54) Azin briefly discussed her 2017 report titled “Barriers to Adoption of Information Technology in Healthcare,” which takes a system thinking perspective to identify barriers to the application of IT in healthcare and outline the solutions. (19:35) Azin unpacked her MS thesis called “Subspace Selection to Suppress Confounding Source Domain Information in AAM Transfer Learning,” which explores transfer learning in the context of facial analysis. (28:48) Azin discussed her work as a research assistant at the Toronto Rehabilitation Institute, working on a research project that addressed algorithmic biases in facial detection technology for older adults with dementia. (33:02) Azin has been an Applied Research Scientist since 2018 at Georgian, a venture capital firm in Canada that focuses on investing in companies operating in the IT sectors. (38:20) Azin shared the details of her initial Georgian project to develop a robust and accurate injury prediction model using a hybrid instance-based transfer learning method. (42:12) Azin unpacked her Medium blog post discussing transfer learning in depth (problems, approaches, and applications). (48:18) Azin explained how transfer learning could address the widespread “cold-start” problem in the industry. (49:50) Azin shared the challenges of working on a fintech platform with a team of engineers at Georgian on various areas such as supervised learning, explainability, and representation learning.
(51:46) Azin went over her project with Tractable AI, a UK-based company that develops AI applications for accident and disaster recovery. (55:26) Azin shared her excitement for ML applications using data-efficient methods to enhance life quality. (57:46) Closing segment.

Azin's Contact Info: Website, Twitter, LinkedIn, Google Scholar, GitHub

Mentioned Content


    “Barriers to Adoption of Information Technology in Healthcare” (2017) “Subspace Selection to Suppress Confounding Source Domain Information in AAM Transfer Learning” (2017) “A Hybrid Instance-based Transfer Learning Method” (2018) “Prediction of Workplace Injuries” (2019) “Algorithmic Bias in Clinical Populations - Evaluating and Improving Facial Analysis Technology in Older Adults with Dementia” (2019) “Limitations and Biases in Facial Landmark Detection” (2019)

    Blog posts

    “An Introduction to Transfer Learning” (Dec 2018) “Overcoming The Cold-Start Problem: How We Make Intractable Tasks Tractable” (April 2021)


    Yoshua Bengio (Professor of Computer Science and Operations Research at University of Montreal) Geoffrey Hinton (Professor of Computer Science at University of Toronto) Louis-Philippe Morency (Associate Professor of Computer Science at Carnegie Mellon University)


    "Machine Learning: A Probabilistic Perspective" (by Kevin Murphy)

    Note: Azin and her collaborator are going to give a talk at ODSC Europe 2021 in June about a Georgian project with a portfolio company, Tractable. They have also written a short blog post about it.

  • Show Notes (02:09) Gordon briefly talked about his undergraduate years studying Psychology and Philosophy at Rutgers University in the early 90s. (03:24) Gordon reflected on the first decade of his career getting into database technologies. (05:34) Gordon discussed his predilection towards consulting, specifically his role in the professional services team at Ab Initio Software in the early 2000s. (08:02) Gordon recalled the challenges of leading data warehousing initiatives at Smarter Travel Media and ClickSquared in the 2000s. (13:14) Gordon emphasized the advantage of a multi-tenant database over a traditional relational database. (18:30) Gordon recalled his one-year stint at Cervello, leading business intelligence implementations for their clients. (21:59) Gordon elaborated on his projects during his 3 years as the director of business intelligence infrastructure at Fitbit. (26:09) Gordon dived into his framework for choosing data tooling vendors while at Fitbit (and how he settled on a tiny startup called Snowflake back then). (30:02) Gordon provided recommendations for startups to be data-driven. (33:24) Gordon recalled practices to foster effective collaboration while managing the 3 teams of data engineering, data warehousing, and data analytics at Fitbit. (36:44) Gordon went over his proudest accomplishment as the director of data engineering at ezCater, making substantial improvements to their data warehouse platform. (38:59) Gordon shared his framework for interviewing data engineers. (41:39) Gordon walked through his consulting engagements in analytics engineering for Zipcar and data warehousing for edX. (46:17) Gordon reflected on his time as the Vice President of business intelligence at HubSpot. (50:50) Gordon unpacked his notion of the “Data Hierarchy of Needs,” which entails five pillars - data security, data quality, system reliability, user experience, and data coverage.
(56:55) Gordon discussed current opportunities for driving better social outcomes and empowering democracy through data. (59:48) Gordon shared the key criteria that enable healthy team dynamics from his hands-on experience building data teams. (01:02:13) Gordon unpacked the central features and benefits of Snowflake for the uninitiated. (01:06:25) Gordon gave his verdict on the ETL tooling landscape in the next few years. (01:08:33) Gordon described the data community in Boston. (01:09:52) Closing segment.

Gordon's Contact Info: LinkedIn

Mentioned Content


    Tristan Handy (co-founder of Fishtown Analytics and co-creator of dbt) Michael Kaminsky (who coined the term “Analytics Engineering”) Barr Moses (co-founder and CEO of Monte Carlo, who coined the term “Data Observability”)


    “Start With Why” (By Simon Sinek)

  • Show Notes (2:05) Louis went over his childhood as a self-taught programmer and his early days in school as a freelance developer. (4:22) Louis described his overall undergraduate experience getting a Bachelor's degree in IT Systems Engineering from Hasso Plattner Institute, a highly-ranked computer science university in Germany. (6:10) Louis dissected his Bachelor thesis at HPI called “Differentiable Convolutional Neural Network Architectures for Time Series Classification,” which addresses the problem of automatically designing architectures for time series classification efficiently, using a regularization technique for ConvNets that enables joint training of network weights and architecture through back-propagation. (7:40) Louis provided a brief overview of his publication “Transfer Learning for Speech Recognition on a Budget,” which explores Automatic Speech Recognition training by model adaptation under constrained GPU memory, throughput, and training data. (10:31) Louis described his one-year Master of Research degree in Computational Statistics and Machine Learning at University College London, supervised by David Barber. (12:13) Louis unpacked his paper “Modular Networks: Learning to Decompose Neural Computation,” published at NeurIPS 2018, which proposes a training algorithm that flexibly chooses neural modules based on the processed data. (15:13) Louis briefly reviewed his technical report, “Scaling Neural Networks Through Sparsity,” which discusses near-term and long-term solutions to handle sparsity between neural layers.
(18:30) Louis mentioned his report, “Characteristics of Machine Learning Research with Impact,” which explores questions such as how to measure research impact and what questions the machine learning community should focus on to maximize impact. Louis explained his report, "Contemporary Challenges in Artificial Intelligence," which covers lifelong learning, scalability, generalization, self-referential algorithms, and benchmarks. (23:16) Louis talked about his motivation to start a blog and discussed his two-part blog series on intelligence theories (part 1 on universal AI and part 2 on active inference). (27:46) Louis described his decision to pursue a Ph.D. at the Swiss AI Lab IDSIA in Lugano, Switzerland, where he has been working on Meta Reinforcement Learning agents with Jürgen Schmidhuber. (30:06) Louis created a very extensive map of reinforcement learning in 2019 that outlines the goals, methods, and challenges associated with the RL domain. (33:50) Louis unpacked his blog post reflecting on his experience at NeurIPS 2018 and providing updates on the AGI roadmap regarding topics such as scalability, continual learning, meta-learning, and benchmarks. (37:04) Louis dissected his ICLR 2020 paper “Improving Generalization in Meta Reinforcement Learning using Learned Objectives,” which introduces a novel algorithm called MetaGenRL, inspired by biological evolution. (44:03) Louis elaborated on his publication “Meta-Learning Backpropagation And Improving It,” which introduces the Variable Shared Meta-Learning framework that unifies existing meta-learning approaches and demonstrates that simple weight-sharing and sparsity in a network are sufficient to express powerful learning algorithms. (51:14) Louis expanded on his idea to bootstrap AI that entails how the task, the general meta-learner, and the unsupervised objective should interact (proposed at the end of his invited talk at NeurIPS 2020).
(54:14) Louis shared his advice for individuals who want to make a dent in AI research. (56:05) Louis shared his three most useful productivity tips. (58:36) Closing segment.

Louis's Contact Info: Website, Twitter, LinkedIn, Google Scholar, GitHub

Mentioned Content

    Papers and Reports

    Differentiable Convolutional Neural Network Architectures for Time Series Classification (2017) Transfer Learning for Speech Recognition on a Budget (2017) Modular Networks: Learning to Decompose Neural Computation (2018) Contemporary Challenges in Artificial Intelligence (2018) Characteristics of Machine Learning Research with Impact (2018) Scaling Neural Networks Through Sparsity (2018) Improving Generalization in Meta Reinforcement Learning using Learned Objectives (2019) Meta-Learning Backpropagation And Improving It (2020)

    Blog posts

    Theories of Intelligence - Part 1 and Part 2 (July 2018) Modular Networks: Learning to Decompose Neural Computation (May 2018) How to Make Your ML Research More Impactful (Dec 2018) A Map of Reinforcement Learning (Jan 2019) NeurIPS 2018, Updates on the AI Roadmap (Jan 2019) MetaGenRL: Improving Generalization in Meta Reinforcement Learning (Oct 2019) General Meta-Learning and Variable Sharing (Nov 2020)


    Jeff Clune (for his push on meta-learning research) Kenneth Stanley (for his deep thoughts on open-ended learning) Jürgen Schmidhuber (for being a visionary scientist)


    "Grit" (by Angela Duckworth)
  • Show Notes (01:58) Dzejla described her undergraduate experience studying Computer Science at the Sarajevo School of Science and Technology back in the mid-2000s. (07:59) Dzejla recapped her overall experience getting a Ph.D. in Computer Science at Stony Brook University. (14:38) Dzejla unpacked the key research problem in her Ph.D. thesis titled “Upper and Lower Bounds on Sorting and Searching in External Memory.” (19:13) Dzejla went over the details of her paper “Don't Thrash: How to Cache Your Hash on Flash,” which describes the Cascade Filter, an approximate-membership-query data structure that scales beyond main memory and is an alternative to the well-known Bloom filter data structure. (24:41) Dzejla elaborated on her work “The batched predecessor problem in external memory,” which studies the lower bounds in three external memory models: the I/O comparison model, the I/O pointer-machine model, and the indexability model. (29:56) Dzejla shared her learnings from being a teaching assistant for the Introduction to Algorithms course at Stony Brook (both at the undergraduate and graduate level). (35:08) Dzejla went over her summer internships at Microsoft's Server and Tools Division during her Ph.D. (41:06) Dzejla reasoned about her decision to return to the Sarajevo School of Science and Technology as an Assistant Professor of Computer Science. (47:22) Dzejla dissected the essential concepts and methods covered in her Data Structures, Introductory Algorithms, Advanced Algorithms, and Algorithms for Big Data courses taught at SSST. (48:42) Dzejla provided a brief overview of the Computer Science / Software Engineering department at the International University of Sarajevo (where she has been a professor since 2017). (50:57) Dzejla briefly talked about the courses that she taught at IUS, including Intro to Programming, Human-Computer Interaction, and Algorithms / Data Structures.
(52:49) Dzejla shared the challenges of writing Algorithms and Data Structures for Massive Datasets, which introduces data processing and analytics techniques specifically designed for large distributed datasets. (56:14) Dzejla explained concepts in Part 1 of the book - including Hash Tables, Approximate Membership, Bloom Filters, Frequency/Cardinality Estimation, Count-Min Sketch, and HyperLogLog. (58:38) Dzejla provided a brief overview of techniques to handle streaming data in Part 2 of the book. (01:00:14) Dzejla mentioned the data structures for large databases and external-memory algorithms in Part 3 of the book. (01:02:15) Dzejla shared her thoughts about the tech community in Sarajevo. (01:04:16) Closing segment.

Dzejla's Contact Info: LinkedIn, Twitter, Google Scholar

Mentioned Content
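The approximate-membership idea behind Part 1 of the book (and the Cascade Filter paper above) can be sketched with a toy Bloom filter. This is an illustrative sketch, not code from the book; the bit-array size, number of hashes, and sample items are hypothetical choices.

```python
# Toy Bloom filter: a bit array plus k hash positions per item.
# Membership tests can return false positives but never false negatives.
# Sizes below are illustrative only.
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive k positions by salting a single cryptographic hash.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
for word in ["cache", "hash", "flash"]:
    bf.add(word)
```

The space/accuracy trade-off is the whole point: with far fewer bits than the items would need verbatim, every inserted item is always found, while a non-member is only rarely (and tunably) misreported as present.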


    “Upper and Lower Bounds on Sorting and Searching in External Memory” (Dzejla's Ph.D. Thesis, 2014) “Don't Thrash: How to Cache Your Hash on Flash” (2012) “The batched predecessor problem in external memory” (2014)


    Erik Demaine (Computer Science Professor at MIT) Michael Bender (Computer Science Professor at Stony Brook, Dzejla's Ph.D. Advisor) Joseph Mitchell (Computational Geometry Professor at Stony Brook) Steven Skiena (Computer Science Professor at Stony Brook) Jeff Erickson (Computer Science Professor at UIUC)


    “Algorithms and Data Structures for Massive Datasets” (by Dzejla Medjedovic, Emin Tahirovic, and Ines Dedovic) “The Algorithm Design Manual” (by Steven Skiena)

    Here is a permanent 40% discount code (good for all Manning products in all formats) for Datacast listeners: poddcast19.

    Here is one free eBook code good for a copy of Algorithms and Data Structures for Massive Datasets for a lucky listener: algdcsr-7135.

  • Show Notes (1:45) Willem discussed his undergraduate degree in Mechatronic Engineering at Stellenbosch University in the early 2010s. (2:34) Willem recalled his entrepreneurial journey founding and selling a networking startup that provides internet access to private residents on campus. (5:37) Willem worked for two years as a Software Engineer focusing on data systems at Systems Anywhere in Cape Town after college. (6:49) Willem talked about his move to Bangkok working as a Senior Software Engineer at INDEFF, a company in industrial control systems. (9:52) Willem went over his decision to join Gojek, a leading Indonesian on-demand multi-service platform and digital payment technology group. (12:16) Willem mentioned the engineering challenges associated with building complex data systems for super-apps. (14:50) Willem dissected Gojek's ML platform, including these four solutions for various stages of the ML life cycle: Clockwork, Merlin, Feast, and Turing. (19:24) Willem recapped the lessons from designing the ML platform to meet Gojek's scaling requirements - as delivered at Cloud Next 2018. (23:09) Willem briefly went through the key design components to incorporate Kubeflow pipelines into Gojek's existing ML platform - as delivered at KubeCon 2019. (26:21) Willem explained the inception of Feast, an open-source feature store that bridges the gap between data and models. (32:20) Willem talked about prioritizing the product roadmap and engaging the community for an open-source project. (35:07) Willem recapped the key lessons learned and envisioned Feast's future as a lightweight modular feature store. (37:29) Willem explained the differences between commercial and open-source feature stores (given Tecton's recent backing of Feast). (41:36) Willem reflected on his experience living and working in Southeast Asia. (44:33) Closing segment.

Willem's Contact Info: Twitter, LinkedIn, GitHub

Mentioned Content


    Feast Project website: feast.dev
    Feast Slack community: #Feast
    Feast Documentation: docs.feast.dev
    Feast GitHub repository: feast-dev/feast
    Feast on StackOverflow
    Feast Wiki: /Feast+Home
    Feast Twitter: @feast_dev


    An Introduction to Gojek’s Machine Learning Platform (2019) Introducing Feast: An Open-Source Feature Store For Machine Learning (2019) A State of Feast (2020) Why Tecton is Backing The Feast Open-Source Feature Store (2020)


    Lessons Learned Scaling Machine Learning at GoJek on Google Cloud (Cloud Next 2018) Accelerating Machine Learning App Development with Kubeflow Pipelines (Cloud Next 2019) Moving People and Products with Machine Learning on Kubeflow (KubeCon 2019)


    David Aronchick (Open-Source ML Strategy at Azure, Ex-PM for Kubernetes at Google, Co-Founder of Kubeflow, Advisor to Tecton) Jeremy Lewi (Principal Engineer, Co-Founder of Kubeflow) Felipe Hoffa (Developer Advocate for BigQuery, Data Cloud Advocate for Snowflake)


    Cal Newport’s "Deep Work"

    Willem will be a speaker at Tecton's apply() virtual conference (April 21-22, 2021) for data and ML teams to discuss the practical data engineering challenges faced when building ML for the real world. Participants will share best practice development patterns, tools of choice, and emerging architectures they use to successfully build and manage production ML applications. Everything is on the table, from managing labeling pipelines, to transforming features in real time, to serving at scale. Register for free now!

  • Show Notes (1:56) Jim went over his education at Trinity College Dublin in the late 90s / early 2000s, where he got early exposure to academic research in distributed systems. (4:26) Jim discussed his research focused on dynamic software architecture, particularly the K-Component model that enables individual components to adapt to a changing environment. (5:37) Jim explained his research on collaborative reinforcement learning that enables groups of reinforcement learning agents to solve online optimization problems in dynamic systems. (9:03) Jim recalled his time as a Senior Consultant for MySQL. (9:52) Jim shared the initiatives at the RISE Research Institute of Sweden, where he has been a researcher since 2007. (13:16) Jim dissected his peer-to-peer systems research at RISE, including theoretical results for search algorithms and walk topologies. (15:30) Jim went over the challenges of building peer-to-peer live streaming systems at RISE, such as GradientTV and Glive. (18:18) Jim provided an overview of research activities at the Division of Software and Computer Systems at the School of Electrical Engineering and Computer Science at KTH Royal Institute of Technology. (19:04) Jim has taught courses on Distributed Systems and Deep Learning on Big Data at KTH Royal Institute of Technology. (22:20) Jim unpacked his O'Reilly article in 2017 called “Distributed TensorFlow,” which includes the deep learning hierarchy of scale. (29:47) Jim discussed the development of HopsFS, a next-generation distribution of the Hadoop Distributed File System (HDFS) that replaces its single-node in-memory metadata service with a distributed metadata service built on a NewSQL database. (34:17) Jim rationalized the intention to commercialize HopsFS and build Hopsworks, a user-friendly data science platform for Hops. (36:56) Jim explored the relative benefits of public research money and VC-funded money.
(41:48) Jim unpacked the key ideas in his post “Feature Store: The Missing Data Layer in ML Pipelines.” (47:31) Jim dissected the critical design that enables the Hopsworks feature store to refactor a monolithic end-to-end ML pipeline into separate feature engineering and model training pipelines. (52:49) Jim explained why data warehouses are insufficient for machine learning pipelines and why a feature store is needed instead. (57:59) Jim discussed prioritizing the product roadmap for the Hopsworks platform. (01:00:25) Jim hinted at what's on the 2021 roadmap for Hopsworks. (01:03:22) Jim recalled the challenges of getting early customers for Hopsworks. (01:04:30) Jim intuited the differences and similarities between being a professor and being a founder. (01:07:00) Jim discussed worrying trends in the European tech ecosystem and the role that Logical Clocks will play in the long run. (01:13:37) Closing segment.

Jim's Contact Info: Logical Clocks, Twitter, LinkedIn, Google Scholar, Medium, ACM Profile, GitHub

Mentioned Content

    Research Papers


    “Distributed TensorFlow” (2017) “Reflections on AWS's S3 Architectural Flaws” (2017) “Meet Michelangelo: Uber's Machine Learning Platform” (2017) “Feature Store: The Missing Data Layer in ML Pipelines” (2018) “What Is Wrong With European Tech Companies?” (2019) “ROI of Feature Stores” (2020) “MLOps With A Feature Store” (2020) “ML Engineer Guide: Feature Store vs. Data Warehouse” (2020) “Unifying Single-Host and Distributed Machine Learning with Maggy” (2020) “How We Secure Your Data With Hopsworks” (2020) “One Function Is All You Need For ML Experiments” (2020) “Hopsworks: World's Only Cloud-Native Feature Store, now available on AWS and Azure” (2020) “Hopsworks 2.0: The Next Generation Platform for Data-Intensive AI with a Feature Store” (2020) “Hopsworks Feature Store API 2.0, a new paradigm” (2020) “Swedish startup Logical Clocks takes a crack at scaling MySQL backend for live recommendations” (2021)


    Apache Hudi (by Uber) Delta Lake (by Databricks) Apache Iceberg (by Netflix) MLflow (by Databricks) Apache Flink (by The Apache Foundation)


    Leslie Lamport (The Father of Distributed Computing) Jeff Dean (Creator of MapReduce and TensorFlow, Lead of Google AI) Richard Sutton (The Father of Reinforcement Learning - who wrote “The Bitter Lesson”)

    Programming Books

    C++ programming books (by Scott Meyers) “Effective Java” (by Joshua Bloch) “Programming Erlang” (by Joe Armstrong) “Concepts, Techniques, and Models of Computer Programming” (by Peter Van Roy and Seif Haridi)
  • Show Notes (2:20) Pier shared his college experience at the University of Southampton studying Electronic Engineering. (3:46) For his final undergraduate project, Pier developed a suite of games and used machine learning to analyze brainwave data in order to classify whether or not a child is affected by autism. (11:26) Pier went over his favorite courses and involvement with the AI Society during his additional year at the University of Southampton to get a Master's in Artificial Intelligence. (13:40) For his Master's thesis called “Causal Reasoning in Machine Learning,” Pier created and deployed a suite of Agent-Based and Compartmental Models to simulate epidemic disease developments in different types of communities. (26:51) Pier went over his stints as a developer intern at Fidessa and a freelance data scientist at Digital-Dandelion. (29:21) Pier reflected on his time (so far) as a data scientist at SAS Institute, where he helps their customers solve various data-driven challenges using cloud-based technologies and DevOps processes. (33:37) Pier discussed the key benefits that writing and editing technical content for Towards Data Science brings to his professional development. (36:31) Pier covered the threads that he kept pulling with his blog posts. (38:50) Pier talked about his Augmented Reality Personal Business Card created in HTML using the AR.js library. (41:12) Pier brought up data structures in two other impressive JavaScript projects using TensorFlow.js and ml5.js. (44:19) Pier went over his experience working with data visualization tools such as Plotly, R Shiny, and Streamlit. (47:27) Pier talked about his work on a chapter for a book called “Applied Data Science in Tourism” that is going to be published with Springer this year. (48:37) Pier shared his thoughts regarding the tech community in London.
(49:19) Closing segment.

Pier's Contact Info: Website, LinkedIn, Twitter, GitHub, Medium, Patreon, Kaggle

Mentioned Content: “Alleviate Children's Health Issues Through Games and Machine Learning” “Causal Reasoning in Machine Learning” Andrej Karpathy (Director of AI and Autopilot at Tesla) Cassie Kozyrkov (Chief Decision Scientist at Google) Iain Brown (Head of Data Science at SAS) “The Book Of Why” (by Judea Pearl) “Pattern Recognition and Machine Learning” (by Christopher Bishop)

  • Timestamps

    (1:55) Alba shared her background growing up interested in studying Physics and pivoting into quantum mechanics. (3:33) Alba went over her Bachelor's in Fundamental Physics at the University of Barcelona. (4:54) Alba continued her education with an MS degree specialized in Particle Physics and Gravitation. (6:40) Alba started her Ph.D. in Physics in 2015 and discussed her first publication, “Operational Approach to Bell Inequalities: Application to Qutrits.” (9:48) Alba also spent time as a visiting scholar at the University of Oxford and the University of Madrid during her Ph.D. (11:25) Alba explained her second paper to understand the connection between maximal entanglement and the fundamental symmetries of high-energy physics. (13:27) Alba dissected her next work titled “Multipartite Entanglement in Spin Chains and The Hyperdeterminant.” (18:56) Alba shared the origin of Quantic, a quantum computation joint effort between the University of Barcelona and the Barcelona Supercomputing Center. (22:27) Alba unpacked her article “Quantum Computation: Playing The Quantum Symphony,” making a metaphor between quantum computing and a musical symphony. (27:47) Alba discussed the motivation and contribution of her paper “Exact Ising Model Simulation On A Quantum Computer.” (32:51) Alba recalled creating a tutorial that ended up winning the Teach Me QISKit challenge from IBM back in 2018. (35:01) Alba elaborated on her paper “Quantum Circuits For the Maximally Entangled States,” which designs a series of quantum circuits that generate absolute maximally entangled states to benchmark a quantum computer. (38:54) Alba dissected key ideas in her paper “Data Re-Uploading For a Universal Quantum Classifier.” (43:51) Alba explained how she leveled up her knowledge of classical neural networks.
    (47:40) Alba shared her experience as a Postdoctoral Fellow at The Matter Lab at the University of Toronto - working on quantum machine learning and variational quantum algorithms (check out the Quantum Research Seminars Toronto that she has been organizing). (52:18) Alba explained her work on the Meta-Variational Quantum Eigensolver algorithm capable of learning the ground state energy profile of a parametrized Hamiltonian. (59:23) Alba went over Tequila, a development package for quantum algorithms in Python that her group created. (01:04:49) Alba presented a quantum calling for new algorithms, applications, architectures, quantum-classical interfaces, and more (as presented here). (01:08:59) Alba has been active in education and public outreach activities encouraging scientific vocations for young minds, especially in Catalonia. (01:12:07) Closing segment.

    Her Contact Info

    Website, Twitter, LinkedIn, Google Scholar, GitHub

    Her Recommended Resources

    Ewin Tang (Ph.D. Student in Theoretical Computer Science at the University of Washington) Alán Aspuru-Guzik (Professor of Chemistry and Computer Science at the University of Toronto, Alba's current supervisor) José Ignacio Latorre (Professor of Theoretical Physics at the University of Barcelona, Alba's former supervisor) Quantum Computation and Quantum Information (by Michael Nielsen and Isaac Chuang) Quantum Field Theory and The Standard Model (by Matthew Schwartz) The Structure of Scientific Revolutions (by Thomas Kuhn) Against Method (by Paul Feyerabend) Quantum Computing Since Democritus (by Scott Aaronson)
  • Timestamps (2:07) JY discussed his college time studying Computer Science and Applied Math at Ecole Polytechnique - a leading French institute in science and technology. (3:04) JY reflected on his time at Stanford getting a Master's in Management Science and Engineering, where he served as a Teaching Assistant for CS 229 (Machine Learning) and CS 246 (Mining Massive Datasets). (6:14) JY walked through his ML engineering internship at LiveRamp - a data connectivity platform for the safe and effective use of data. (7:54) JY reflected on his next three years at Databricks, first as a software engineer and then as a tech lead for the Spark Infrastructure team. (10:00) JY unpacked the challenges of packaging/managing/monitoring Spark clusters and automating the launch of hundreds of thousands of nodes in the cloud every day. (14:48) JY shared the founding story behind Data Mechanics, whose mission is to give superpowers to the world's data engineers so they can make sense of their data and build applications at scale on top of it. (18:09) JY explained the three tenets of Data Mechanics: (1) managed and serverless, (2) integrated into clients' workflows, and (3) built on top of open-source software (read the launch blog post). (22:06) JY unpacked the core concepts of Spark-on-Kubernetes and evaluated the benefits/drawbacks of this new deployment mode - as presented in “Pros and Cons of Running Apache Spark on Kubernetes.” (26:00) JY discussed Data Mechanics' main improvements on the open-source version of Spark-on-Kubernetes - including an intuitive user interface, dynamic optimizations, integrations, and security - as explained in “Spark on Kubernetes Made Easy.” (28:35) JY went over Data Mechanics Delight, a customized Spark UI that was recently open-sourced. (35:40) JY shared the key ideas in his thought-leading piece on how to be successful with Apache Spark in 2021. (38:42) JY went over his experience going through the Y Combinator program in summer 2019.
(40:56) JY reflected on the key decisions to get the first cohort of customers for Data Mechanics. (42:26) JY shared valuable hiring lessons for early-stage startup founders. (44:34) JY described the data and tech community in France. (47:19) Closing segment.

    His Contact Info

    Twitter
    LinkedIn
    Data Mechanics

    His Recommended Resources

    Jure Leskovec (Associate Professor of Computer Science at Stanford / Chief Scientist at Pinterest)
    Jeff Bezos (Founder of Amazon)
    Matei Zaharia (CTO of Databricks and creator of Apache Spark)
    “Designing Data-Intensive Applications” (by Martin Kleppmann)
  • Timestamps (2:55) Chris went over his experience studying Computer Science as an undergraduate at the University of Southern California in the late 1990s. (5:26) Chris recalled working as a Software Engineer at NASA Jet Propulsion Lab during his sophomore year at USC. (9:54) Chris continued his education at USC with an MS and then a Ph.D. in Computer Science. Under the guidance of Dr. Nenad Medvidović, he wrote his Ph.D. thesis, "Software Connectors For Highly-Distributed And Voluminous Data-Intensive Systems," in which he proposed DISCO, a software architecture-based systematic framework for selecting software connectors based on eight key dimensions of data distribution. (16:28) Towards the end of his Ph.D., Chris started getting involved with the Apache Software Foundation. More specifically, he developed the original proposal and plan for Apache Tika (a content detection and analysis toolkit) in collaboration with Jérôme Charron; Tika was later used to extract data from the Panama Papers, exposing how wealthy individuals exploited offshore tax regimes. (24:58) Chris discussed his process of writing “Tika In Action,” which he co-authored with Jukka Zitting in 2011. (27:01) Since 2007, Chris has been a professor in the Department of Computer Science at the USC Viterbi School of Engineering. He went over the principles covered in his course titled “Software Architectures.” (29:49) Chris touched on the core concepts and practical exercises that students could gain from his course “Information Retrieval and Web Search Engines.” (32:10) Chris continued with his advanced course called “Content Detection and Analysis for Big Data” in recent years (check out this USC article). (36:31) Chris also served as the Director of USC's Information Retrieval and Data Science group, whose mission is to research and develop new methodology and open-source software to analyze, ingest, process, and manage Big Data and turn it into information. 
(41:07) Chris unpacked the evolution of his career at NASA JPL: Member of Technical Staff -> Senior Software Architect -> Principal Data Scientist -> Deputy Chief Technology and Innovation Officer -> Division Manager for the AI, Analytics, and Innovation team. (44:32) Chris dove deep into MEMEX - a JPL project that aims to develop software that advances online search capabilities to the deep web, the dark web, and nontraditional content. (48:03) Chris briefly touched on XDATA - a JPL research effort to develop new computational techniques and open-source software tools to process and analyze big data. (52:23) Chris described his work on the Object-Oriented Data Technology platform, an open-source data management system originally developed by NASA JPL and then donated to the Apache Software Foundation. (55:22) Chris shared the scientific challenges and engineering requirements associated with developing the next generation of reusable science data processing systems for NASA's Orbiting Carbon Observatory space mission and the Soil Moisture Active Passive earth science mission. (01:01:05) Chris talked about his work on NASA's Machine Learning-based Analytics for Autonomous Rover Systems - which consists of two novel capabilities for future Mars rovers (Drive-By Science and Energy-Optimal Autonomous Navigation). (01:04:24) Chris quantified the Apache Software Foundation's impact on the software industry in the past decade and discussed trends in open-source software development. (01:07:15) Chris unpacked his 2013 Nature article called “A vision for data science” - in which he argued that four advancements are necessary to get the best out of big data: algorithm integration, development and stewardship, various data formats, and people power. 
(01:11:54) Chris revealed the challenges of writing the second edition of “Machine Learning with TensorFlow,” a technical book with Manning that teaches the foundational concepts of machine learning and the TensorFlow library's usage to build powerful models rapidly. (01:15:04) Chris mentioned the differences between working in academia and industry. (01:16:20) Chris described the tech and data community in the greater Los Angeles area. (01:18:30) Closing segment.

    His Contact Info

    Wikipedia
    NASA Page
    Google Scholar
    USC Page
    Twitter
    LinkedIn
    GitHub

    His Recommended Resources

    Doug Cutting (Founder of Lucene and Hadoop)
    Hilary Mason (Ex Data Scientist at Bitly and Cloudera)
    Jukka Zitting (Staff Software Engineer at Google)
    “The One Minute Manager” (by Ken Blanchard and Spencer Johnson)

  • Show Notes (2:09) Marcello described his academic experience getting a Master's Degree in Computer Science from the Università di Catania in the early 2000s, where his thesis was titled “Evolutionary Randomized Graph Embedder.” (6:14) Marcello commented on his career phase working as a web developer across various places in Europe. (9:18) Marcello discussed his time working as a software engineer at INPS, a government-owned company that now handles most Italian citizens' public-related data. (10:42) Marcello talked about his time as a data visualization engineer at SwiftIQ. He created a data visualization library that allows the inclusion of dynamic charts in HTML pages with just a few lines of JavaScript. (13:40) Marcello went over his projects while working as a full-stack software engineer for Twitter's User Services Engineering team in Dublin. (17:19) Marcello reflected on his time at Microsoft Zurich's Social and Engagement team, contributing to machine learning infrastructure and tools. (21:28) Marcello briefly touched on his one-year stint at Apple Zurich as a Senior Applied Research Engineer. (23:49) Marcello talked about the challenges of writing “Algorithms and Data Structures in Action,” which introduces a diverse range of algorithms used in web apps, systems programming, and data manipulation. (27:11) Marcello expanded upon part 1 of the book, covering advanced data structures such as D-ary Heaps, Randomized Treaps, Bloom Filters, Disjoint Sets, Tries / Radix Trees, and Caches. (34:51) Marcello brought up data structures for efficient multi-dimensional queries, including various nearest neighbor searches and clustering techniques, in part 2 of the book. (39:21) Marcello briefly described the algorithms in part 3 of the book - graph embeddings, gradient descent, simulated annealing, and genetic algorithms. 
(48:28) Marcello talked about his work on jsgraph - a lightweight library to model graphs, run graph algorithms, and display them on screen. (52:06) Marcello compared the Python, Java, and JavaScript programming languages. (54:13) Marcello discussed his current interest in quantum computing. (56:18) Marcello shared his thoughts regarding the tech communities in Dublin, Zurich, and Rome. (57:37) Closing segment.

    His Contact Info

    Twitter
    LinkedIn
    GitHub
    Blog

    His Recommended Resources

    “Algorithms and Data Structures in Action” (Marcello's book with Manning)
    Andrew Ng
    Geoffrey Hinton
    François Chollet
    “Scalability Rules” (by Martin Abbott and Michael Fisher)

    This 40% discount code is good for all of Manning's products in all formats: poddcast19.

    These are 5 free eBook codes, each good for one copy of "Algorithms and Data Structures in Action":