
GitHub - huggingface/datasets: The largest hub of ready-to-use ...
🤗 Datasets is a lightweight library providing two main features: one-line dataloaders for many public datasets: one-liners to download and pre-process any of the major public datasets (image …
Curated open data · GitHub
Curated, relevant open data. Curated open data has 154 repositories available on GitHub.
GitHub - ncbi/datasets: NCBI Datasets is a new resource that lets …
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. - ncbi/datasets
GitHub - luminati-io/Free-datasets: A collection of multiple free ...
This repository contains a collection of free datasets with thousands of records for use in data analysis, machine learning, and research. The datasets span multiple domains, from business …
A collection of datasets originally distributed in R packages
Rdatasets is a collection of 3499 datasets which were originally distributed alongside the statistical software environment R and some of its add-on packages. The goal is to make …
Datasets For Recommender Systems - GitHub
All of these recommendation datasets can be converted to the atomic files defined in RecBole, which is a unified, comprehensive and efficient recommendation library. After converting to the atomic …
Releases · huggingface/datasets - GitHub
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools - huggingface/datasets
TensorFlow Datasets - GitHub
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ... - tensorflow/datasets
List of Machine Learning Datasets - GitHub
The following is a list of publicly available datasets for various machine learning tasks. Reviews, fixes, reports of dead links, and updates are appreciated. Please …
Toolkit for linearizing PDFs for LLM datasets/training
Toolkit for linearizing PDFs for LLM datasets/training - allenai/olmocr