Data & Science with Glen Wright Colopy
Keith O’Rourke | The Logic of Statistics

August 2, 2022

Dr. Keith O'Rourke talks about the logical reasoning behind statistical modeling. Topics include mathematical vs. scientific reasoning, whether science has become too statistics-focused, and whether statistics is sufficiently science-focused.

Watch it on...
YouTube: https://youtu.be/FqE4ROHBKpY
Podbean: https://dataandsciencepodcast.podbean.com/e/keith-o-rourke-the-logic-of-statistics/

 

Topic List:

0:00 - The logic of statistics
0:30 - What is scientific statistics?
5:15 - The logic of statistics and C. S. Peirce
9:15 - Role of representation in statistics: explicit vs implicit
14:13 - Diagrammatic Reasoning
18:45 - Why is modeling counterfactual?
19:33 - How can statisticians become better scientists?
28:40 - Science is hard
31:24 - Computational approaches to learning
42:00 - Learning through metaphor
46:28 - Diagrammatic representations vs math
48:40 - Is science too statistics-focussed? 
59:35 - Is statistics sufficiently science-focussed? 
1:08:40 - Scientific Debate

 

#statistics #datascience #science 

Jack Fitzsimons | Evil Models: Hiding Malware in Neural Networks

July 25, 2022

Did you know that it's possible to hide malware in neural networks? Actually, you can hide malware in many statistical models. This is the subject of two recently published papers (aptly titled "EvilModel" & "EvilModel 2.0"). Dr. Jack Fitzsimons makes it easy to understand how this is done, using techniques that began long before computers.

 

Watch or listen on... 
YouTube: https://youtu.be/QBnk8ogL8Nk
Podbean: https://dataandsciencepodcast.podbean.com/e/jack-fitzsimons-evil-models-hiding-malware-in-neural-networks/

Scott Cunningham | Causal Inference (The Mixtape)

July 17, 2022

Scott Cunningham (Baylor University) discusses the ideas of his book "Causal Inference: The Mixtape". Topics include trusting inference in the absence of counterfactuals and the challenges of applying scientific methods to social phenomena.

Watch it on...
YouTube: https://youtu.be/yNaCudDVTkY
Podbean: https://dataandsciencepodcast.podbean.com/e/scott-cunningham-causal-inference-the-mixtape/

0:00 - COMING UP...
0:35 - What makes it into the mixtape?
7:10 - Coding to learn
11:15 - More people are expected to work with data & code
12:50 - Design vs program vs estimators
20:40 - Causation with zero correlation
27:00 - Optimization makes everything endogenous
28:45 - The hospital example
29:30 - Credible scientific discovery vs motivated discovery
39:55 - Different meanings of causality
43:30 - The impossible counterfactual 
47:00 - Counterfactual nihilism
49:20 - Social experiments / Defund the police
53:35 - Skepticism about the science of social phenomena
1:05:20 - The Italian crime example
1:16:30 - Scientific debate

 

Eric Daza | Important Ideas in Causal Inference

July 10, 2022

YouTube: https://youtu.be/K5nsSMJVIT0

Andrew Gelman and Aki Vehtari wrote a paper titled "What are the most important statistical ideas of the past 50 years?" The first idea in the list is "counterfactual causal inference". Eric Daza (Evidation Health) walks us through the main ideas of the Gelman & Vehtari paper, drawing examples from several fields, including medical & healthcare statistics.

Topics
0:00 - Coming up...Correlation vs Causation
1:20 - Most important statistical ideas over the last 50 years
6:10 - Counterfactual Causal Inference
9:40 - Assumptions Change between Applied Domains
21:10 - Propensity Score Methods
25:15 - Transportability of Scientific Results 
26:30 - People don't want generalizable results
32:00 - Generic Computation Algorithms
37:00 - Reweighting
43:57 - Matching Methods
58:20 - Medical Data is Higher Dimensional than we think
1:00:15 - Is a Trial Population Representative? 
1:10:35 - Causal Models in the Future
1:18:45 - Apostates Welcome
1:21:45 - Scientific Debate

 

 

Wenting Cheng & Weidong Zhang | Advances in Biotech/Biopharma

May 9, 2022

Wenting and Weidong discuss how the statistical challenges in the biopharma industry have proliferated with the unique demands of biotech and related life science industries.

Ruda Zhang | Gaussian Process Subspace Regression

May 9, 2022

Ruda Zhang (Duke University) walks us through "Gaussian Process Subspace Regression for Model Reduction" by Zhang, Mak, and Dunson.

To keep the topic interesting for both early-career & advanced audiences, we recap key points at a high level so that no one gets lost.

 

This episode involves a presentation, so you may prefer to watch the YouTube version here: https://youtu.be/IPtqUUG4XcY

 

Ruda's website: https://ruda.city/
The paper: https://arxiv.org/abs/2107.04668

Ruda Zhang | Math-Science Duality

April 13, 2022

Watch it on...
YouTube: https://youtu.be/GoDwen-RGZg
Podbean: https://dataandsciencepodcast.podbean.com/e/ruda-zhang-math-science-duality/

Statistics is thought to reside at the interface of science and mathematics. Ruda Zhang (Duke University) discusses the friction at this interface and the role that both mathematical formalism & observational/data-driven intuition play in scientific discovery. A great topic for anyone interested in statistics' role in scientific discovery.

#datascience #ai #science #mathematics

Topic List
00:00 COMING UP...
2:44 Ruda Zhang's compendium of cool ideas + a Gaussian process PSA
7:08 Is intuition undervalued in scientific research?
10:16 Mathematics vs observational science. Rigor vs intuition.
14:07 Intuition & discovery precede mathematical rigor
21:58 Mathematics vs empirical science & the complexity of induction
30:24 Abstract thinking & the cost/benefit of discovery
37:25 The efficient frontier / Pareto Front of knowledge
42:55 Pragmatism and competence
50:24 Math/science dualism
1:15:52 AI making scientific discoveries
1:19:15 Statistical & scientific debate

Simon Mak | Integrating Science into Stats Models

April 5, 2022

#statistics #science #ai

It’s a common dictum that statisticians need to incorporate domain knowledge into their modeling and the interpretation of their results. But how deeply can scientific principles be embedded into statistical models? Prof. Simon Mak (Duke University) is pushing this idea to the limit by integrating fundamental physics, physiology, and biology into both the models and model inference. This includes Simon’s joint work with Profs. David Dunson and Ruda Zhang (also of Duke University).

Scientific reasoning AND stats. What more could we ask for?

Enjoy!

Watch it on...

YouTube: https://youtu.be/bUbZO7R4z40

Podbean: https://dataandsciencepodcast.podbean.com/e/simon-mak-integrating-science-into-stats-models/

 

00:00 - COMING UP….Scientists & Statisticians
02:09 - Introduction - Integrating scientific knowledge into AI/ML
06:08 - How much domain knowledge is sufficient?
09:15 - Choosing which prior knowledge to integrate into a model
14:49 - Black box & gray box optimization
19:50 - Non-physics examples of integrating scientific theory into ML models
22:45 - Scientific principles & modeling at different scales
27:20 - Correlation is just one way of modeling linkage
36:37 - Conditional independence & different-fidelity experiments
39:40 - Innovation vs incorporation of known information in the model
42:52 - Aortic stenosis example
52:49 - Which mathematics can be used to represent scientific knowledge?
57:09 - How to acquire scientific domain knowledge
1:02:45 - Complementary approaches to integrating science
1:06:48 - Gaussian process & integrating priors over functions
1:12:48 - A topic for statisticians and scientists to debate: science-based vs data-based learning

Simon Mak's Webpage: https://sites.google.com/view/simonmak/home

 

Martin Goodson | Practical Data Science & The UK’s AI Roadmap

March 16, 2022

#ai #datascience #startups

Martin Goodson (Evolution AI) describes the key aspects of the UK's AI Roadmap & responses to the document by members of the Royal Statistical Society. In particular, Martin describes the disconnect between the priorities of AI startups and industry practitioners on one side, and government and academia on the other. Martin also outlines which skills early career data scientists should focus on while in school versus after entering the workforce.

Also available on...

YouTube: https://youtu.be/T9qRl6Hclhg

 

Topic List

0:00 COMING UP: Scientific culture & AI

1:25 The UK AI Roadmap

8:44 Who is a data science “practitioner”? 

12:53 Data science in AI startups

20:36 Is there a disconnect between practitioners & academia?

25:09 Key skills for new data science graduates

32:03 Coding & production-level data science

39:30 Learning the right data analysis skills at the course level

45:32 AI leadership

58:40 AI from academia & open-source initiatives

1:05:37 Large institutions' impact on the AI field

1:08:24 Back to the UK AI roadmap  

1:12:16 Building an AI community 

1:13:15 AI in our lifetime: Moonshots & realistic goals

1:14:31 Scientific debate

Jack Fitzsimons | Data Security, Privacy, & Artificial Intelligence

February 28, 2022

Dr. Jack Fitzsimons (Oblivious AI) gives a high-level introduction to the technologies that can either exploit or protect your data privacy. If you'd like to survey the landscape of privacy-preserving technologies (from someone who's building the tech), this is a good place to start!

#datascience #privacy #ai

 

0:00 - Coming up...
3:24 - Introduction
6:20 - Data privacy and privacy-enhancing technologies
13:00 - History of privacy-enhancing technologies
19:54 - Differential privacy: Hiding the influence of a single data point
22:52 - Trading data utility for data privacy
38:32 - Tracking algorithms and how they decide user preferences
42:04 - Preserving privacy: Anonymizing data & VPNs
50:17 - Exploration vs Exploitation: Combining the best of multiple domains to tackle problems
54:13 - Federated learning, input and output privacy of data
58:45 - Balancing data privacy vs data-driven personalization
1:05:50 - What should data scientists/statisticians debate?
