pyvideo · ELC · Jun 19, 2025
diff --git a/pydata-london-2022/category.json b/pydata-london-2022/category.json
@@ -0,0 +1,3 @@
+{
+  "title": "PyData London 2022"
+}
diff --git a/pydata-london-2022/videos/ade-idowu-document-sentence-similarity-solution.json b/pydata-london-2022/videos/ade-idowu-document-sentence-similarity-solution.json
@@ -0,0 +1,47 @@
+{
+  "description": "Ade Idowu Presents:\n\nDocument/Sentence Similarity Solution Using Open Source NLP Libraries, Frameworks and Datasets\n\nThe need to develop robust document/text similarity measure solutions is an essential step for building applications such as Recommendation Systems, Search Engines, Information Retrieval Systems including other ML/AI applications such as News Aggregators or Automated Recruitment systems used to match CVs to job specification and so on. In general, text similarity is the measure of how words/tokens, tweets, phrases, sentences, paragraphs and entire documents are lexically and\u202fsemantically close to each other. Texts/words are lexically similar if\u202fthey\u202fhave similar character sequence or structure and, are semantically similar if they have the same meaning, describe similar concepts and they are used in the same context.\u202f\u202f\n\nThis tutorial will demonstrate a number of strategies for feature extraction i.e., transforming documents to numeric feature vectors. This transformation step is a prerequisite for computing the similarity between documents. Typically, each strategy will involve 4 steps, namely: 1) the use of standard natural language pre-processing techniques to prepare/clean the documents, 2) the transformation of the document text into numeric vectors/embeddings, 3) calculation of document similarity using metrics such as Cosine, Euclidean and Jaccard and, 4) validation of the findings\n\nGithub Repo: https://github.com/aidowu1/Ades-NLP-Recepies/tree/master/Exploration%20of%20Document%20Similarity%20Models\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...\"",
+  "duration": 5302,
+  "language": "eng",
+  "recorded": "2022-06-17",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://pydata.org/london2022/"
+    },
+    {
+      "label": "https://github.com/aidowu1/Ades-NLP-Recepies/tree/master/Exploration%20of%20Document%20Similarity%20Models",
+      "url": "https://github.com/aidowu1/Ades-NLP-Recepies/tree/master/Exploration%20of%20Document%20Similarity%20Models"
+    },
+    {
+      "label": "https://github.com/numfocus/YouTubeVi...",
+      "url": "https://github.com/numfocus/YouTubeVi..."
+    }
+  ],
+  "speakers": [
+    "Ade Idowu"
+  ],
+  "tags": [
+    "Education",
+    "Julia",
+    "NumFOCUS",
+    "Opensource",
+    "PyData",
+    "Python",
+    "Tutorial",
+    "coding",
+    "how to program",
+    "learn",
+    "learn to code",
+    "python 3",
+    "scientific programming",
+    "software"
+  ],
+  "thumbnail_url": "https://i.ytimg.com/vi/qXcRW5fIa1g/maxresdefault.jpg",
+  "title": "Ade Idowu - Document/Sentence Similarity Solution",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=qXcRW5fIa1g"
+    }
+  ]
+}
diff --git a/...s/adrin-jalali-questions-and-practices-to-make-algorithmic-decision-making-more-fair.json b/...s/adrin-jalali-questions-and-practices-to-make-algorithmic-decision-making-more-fair.json
@@ -0,0 +1,43 @@
+{
+  "description": "Adrin Jalali Presents:\n\nMeasurement and Fairness: Questions and Practices to Make Algorithmic Decision Making More Fair\n\nMachine learning is almost always used in systems which automate or semi-automate decision making processes. These decisions are used in recommender systems, fraud detection, healthcare recommendation systems, etc. Many systems, if not most, can induce harm by giving a less desirable outcome for cases where they should in fact give a more desired outcome, e.g. reporting an insurance claim to be fraud when indeed it is not.\n\nIn this talk we first go through different sources of harm which can creep into a system based on machine learning, and the types of harm an ML based system can induce.\n\nTaking lessons from social sciences, one can see input and output values of automated systems as measurements of constructs or a proxy measurement of those constructs. In this talk we go through a set of questions one should ask before and while working on such systems. Some of these questions can be answered quantitatively, and others qualitatively.\n\n\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n00:10 Help us add time stamps or captions to this video! See the description for details.\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps",
+  "duration": 2418,
+  "language": "eng",
+  "recorded": "2022-06-17",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://pydata.org/london2022/"
+    },
+    {
+      "label": "https://github.com/numfocus/YouTubeVideoTimestamps",
+      "url": "https://github.com/numfocus/YouTubeVideoTimestamps"
+    }
+  ],
+  "speakers": [
+    "Adrin Jalali"
+  ],
+  "tags": [
+    "Education",
+    "Julia",
+    "NumFOCUS",
+    "Opensource",
+    "PyData",
+    "Python",
+    "Tutorial",
+    "coding",
+    "how to program",
+    "learn",
+    "learn to code",
+    "python 3",
+    "scientific programming",
+    "software"
+  ],
+  "thumbnail_url": "https://i.ytimg.com/vi/9uLDyK8jKYc/maxresdefault.jpg",
+  "title": "Adrin Jalali - Questions and Practices to Make Algorithmic Decision Making More Fair",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=9uLDyK8jKYc"
+    }
+  ]
+}
diff --git a/...met-melek-what-is-x-up-to-ner-and-relationship-extraction-for-information-extraction.json b/...met-melek-what-is-x-up-to-ner-and-relationship-extraction-for-information-extraction.json
@@ -0,0 +1,51 @@
+{
+  "description": "Ahmet Melek Presents:\n\nWhat is X up to? - NER and Relationship Extraction for Information Extraction\n\nDealing with unstructured text to obtain information is one of the biggest aims in the field of natural language processing. In this talk, we will be demoing a solution where we have unstructured text on a particular topic, and we apply named entity recognition, together with relationship extraction, to extract structured data. We will be introducing our data source, the models that we use, and will be inspecting the end results, viewing particular statistics, and hovering over a graph, extracted from the raw text.\n\nGithub: https://github.com/ahmetmeleq/PyData2022_NER_RelEx\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/What-is-X-up-to_.pdf\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...",
+  "duration": 1912,
+  "language": "eng",
+  "recorded": "2022-06-17",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://pydata.org/london2022/"
+    },
+    {
+      "label": "https://github.com/ahmetmeleq/PyData2022_NER_RelEx",
+      "url": "https://github.com/ahmetmeleq/PyData2022_NER_RelEx"
+    },
+    {
+      "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/What-is-X-up-to_.pdf",
+      "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/What-is-X-up-to_.pdf"
+    },
+    {
+      "label": "https://github.com/numfocus/YouTubeVi...",
+      "url": "https://github.com/numfocus/YouTubeVi..."
+    }
+  ],
+  "speakers": [
+    "Ahmet Melek"
+  ],
+  "tags": [
+    "Education",
+    "Julia",
+    "NumFOCUS",
+    "Opensource",
+    "PyData",
+    "Python",
+    "Tutorial",
+    "coding",
+    "how to program",
+    "learn",
+    "learn to code",
+    "python 3",
+    "scientific programming",
+    "software"
+  ],
+  "thumbnail_url": "https://i.ytimg.com/vi/nO59pdwWELA/maxresdefault.jpg",
+  "title": "Ahmet Melek - What is X up to? - NER and Relationship Extraction for Information Extraction",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=nO59pdwWELA"
+    }
+  ]
+}
diff --git a/...o-saucedo-accelerating-machine-learning-at-scale-with-huggingface-optimum-and-seldon.json b/...o-saucedo-accelerating-machine-learning-at-scale-with-huggingface-optimum-and-seldon.json
@@ -0,0 +1,43 @@
+{
+  "description": "Alejandro Saucedo Presents:\n\nAccelerating High-Performance Machine Learning with HuggingFace, Optimum & Seldon\n\nIdentifying the right tools for high performance production machine learning may be overwhelming as the ecosystem continues to grow at break-neck speed. In this session showcase how practitioners can productionise ML models in scalable ecosystems in an optimizable way without having to deal with the underlying infrastructure challenges. Saucedo takes a GPT-2 HuggingFace model, optimizing it with ONNX and deploying to MLServer at scale using Seldon.\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. \n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...",
+  "duration": 2188,
+  "language": "eng",
+  "recorded": "2022-06-17",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://pydata.org/london2022/"
+    },
+    {
+      "label": "https://github.com/numfocus/YouTubeVi...",
+      "url": "https://github.com/numfocus/YouTubeVi..."
+    }
+  ],
+  "speakers": [
+    "Alejandro Saucedo"
+  ],
+  "tags": [
+    "Education",
+    "Julia",
+    "NumFOCUS",
+    "Opensource",
+    "PyData",
+    "Python",
+    "Tutorial",
+    "coding",
+    "how to program",
+    "learn",
+    "learn to code",
+    "python 3",
+    "scientific programming",
+    "software"
+  ],
+  "thumbnail_url": "https://i.ytimg.com/vi/BQ8NrdkiE44/maxresdefault.jpg",
+  "title": "Alejandro Saucedo - Accelerating Machine Learning at Scale with HuggingFace, Optimum and Seldon",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=BQ8NrdkiE44"
+    }
+  ]
+}
diff --git a/...der-hendorf-lessons-learned-about-data-ai-at-enterprises-and-smes-pydata-london-2022.json b/...der-hendorf-lessons-learned-about-data-ai-at-enterprises-and-smes-pydata-london-2022.json
@@ -0,0 +1,43 @@
+{
+  "description": "Alexander Hendorf presents:\n\nLessons Learned About Data & AI at Enterprises and SMEs\n\nAll one needs is strategy, skill and resources to make digitalization and AI happen. So why is everything taking so long? Shouldn\u2019t you all be finished yesterday already? An honest talk about how to address the complexity of making data and AI happen in enterprises.\n\nMany incumbents are transitioning to new technologies while their businesses operate on systems that are years or decades old. Introducing new technologies is not just about introducing Open Source or introducing community culture or working agile or SCRUM or explaining complicated technology stuff to executives. The truth is: it requires all of it and likely even more. Mastering innovation requires having many balls in the air at once.\n\nIn this talk I'll present a transformation use case of an established player including our best practices and anti-patterns.\n\nWe will discuss the following aspects:\n- From idea to strategy\n- Assessing the status quo\n- Introducing Python and Open Source and what to use (or not)\n- Legacy is in the the house, still\n- Getting all departments on the same page\n- Introducing a community-driven collaborative culture\n\nSlides: https://pydata.org/london2022/wp-content/uploads/2022/07/LESSONS-LEARNED-ABOUT-DATA-AI-AT-ENTERPRISES-AND-SMES-PyData-London-22.pdf\n\nwww.pydata.org \n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.",
+  "duration": 2321,
+  "language": "eng",
+  "recorded": "2022-06-17",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://pydata.org/london2022/"
+    },
+    {
+      "label": "https://pydata.org/london2022/wp-content/uploads/2022/07/LESSONS-LEARNED-ABOUT-DATA-AI-AT-ENTERPRISES-AND-SMES-PyData-London-22.pdf",
+      "url": "https://pydata.org/london2022/wp-content/uploads/2022/07/LESSONS-LEARNED-ABOUT-DATA-AI-AT-ENTERPRISES-AND-SMES-PyData-London-22.pdf"
+    }
+  ],
+  "speakers": [
+    "Alexander Hendorf"
+  ],
+  "tags": [
+    "Education",
+    "Julia",
+    "NumFOCUS",
+    "Opensource",
+    "PyData",
+    "Python",
+    "Tutorial",
+    "coding",
+    "how to program",
+    "learn",
+    "learn to code",
+    "python 3",
+    "scientific programming",
+    "software"
+  ],
+  "thumbnail_url": "https://i.ytimg.com/vi/Bp3pUSZ6DpU/maxresdefault.jpg",
+  "title": "Alexander Hendorf - Lessons Learned About Data & AI at Enterprises and SMEs | PyData London 2022",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=Bp3pUSZ6DpU"
+    }
+  ]
+}
diff --git a/...ideos/anders-bogsnes-sqlalchemy-and-you-making-sql-the-best-thing-since-sliced-bread.json b/...ideos/anders-bogsnes-sqlalchemy-and-you-making-sql-the-best-thing-since-sliced-bread.json
@@ -0,0 +1,47 @@
+{
+  "description": "Anders Bogsnes presents: \n\nSQLAlchemy and you - making SQL the best thing since sliced bread\n\nAre you writing SQL strings in your code? Have you only used ORMs and want to start getting more control over your SQL?\n\nSQLAlchemy is the gold-standard for working with SQL in Python and this tutorial will get you comfortable with working in it so you can take advantage of its power. We will go through Core and ORM abstractions so you'll be comfortable navigating through the different layers and be able to fully use the power of Python when writing your SQL \n\nGithub Repo: https://github.com/andersbogsnes/pydata-london-2022-sqlalchemy-tutorial\n\nwww.pydata.org\n\nPyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.\n\nPyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.\n\n00:00 Welcome!\n\nWant to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...",
+  "duration": 5302,
+  "language": "eng",
+  "recorded": "2022-06-17",
+  "related_urls": [
+    {
+      "label": "Conference Website",
+      "url": "https://pydata.org/london2022/"
+    },
+    {
+      "label": "https://github.com/andersbogsnes/pydata-london-2022-sqlalchemy-tutorial",
+      "url": "https://github.com/andersbogsnes/pydata-london-2022-sqlalchemy-tutorial"
+    },
+    {
+      "label": "https://github.com/numfocus/YouTubeVi...",
+      "url": "https://github.com/numfocus/YouTubeVi..."
+    }
+  ],
+  "speakers": [
+    "Anders Bogsnes"
+  ],
+  "tags": [
+    "Education",
+    "Julia",
+    "NumFOCUS",
+    "Opensource",
+    "PyData",
+    "Python",
+    "Tutorial",
+    "coding",
+    "how to program",
+    "learn",
+    "learn to code",
+    "python 3",
+    "scientific programming",
+    "software"
+  ],
+  "thumbnail_url": "https://i.ytimg.com/vi/X4-hu3vZAOg/maxresdefault.jpg",
+  "title": "Anders Bogsnes - SQLAlchemy and You - Making SQL the Best Thing Since Sliced Bread",
+  "videos": [
+    {
+      "type": "youtube",
+      "url": "https://www.youtube.com/watch?v=X4-hu3vZAOg"
+    }
+  ]
+}