Use Python 3.12.
- First run `data_preparation.ipynb` to prepare the dataset (clean and process it). Note that it will take around 20 minutes due to the large dataset.
- Then run `recommender_system.ipynb`. Inside it you will find the recommender system explained in class plus some extra recommender systems with graphs; a rough sketch of the content-based idea is shown below.
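As a hedged illustration of the kind of content-based approach such a notebook might use (the notebooks contain the actual implementations; `cleaned_data.csv` plus the `title`/`desc` column names are assumptions here, and "The Hobbit" is just an example query):

```python
# Minimal content-based recommender sketch (illustrative only; the notebooks
# contain the real implementations). "cleaned_data.csv" and the "title"/"desc"
# column names are assumptions, not confirmed by this repository.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

books = pd.read_csv("cleaned_data.csv")

# Turn each book description into a TF-IDF vector.
vectorizer = TfidfVectorizer(stop_words="english", max_features=20_000)
tfidf = vectorizer.fit_transform(books["desc"].fillna(""))

def similar_books(title: str, top_n: int = 5) -> pd.DataFrame:
    """Return the top_n books whose descriptions are most similar to `title`."""
    idx = books.index[books["title"] == title][0]
    scores = cosine_similarity(tfidf[idx], tfidf).ravel()
    best = scores.argsort()[::-1][1 : top_n + 1]  # skip the queried book itself
    return books.iloc[best][["title"]].assign(similarity=scores[best])

print(similar_books("The Hobbit"))  # example title, assuming it exists in the data
```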
The repository contains the following data files:

- `GoodReads_100k_books.csv`: the original dataset.
- `goodreads_with_languages.csv`: the original dataset with the addition of the language used in each book (processed by `utils/lang_detect.py`).
- `cleaned_data.csv`: the dataset after being cleaned and processed by `data_preparation.ipynb`.
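If you want a quick look at these files outside the notebooks, a small pandas snippet is enough (the `language` column name in `goodreads_with_languages.csv` is an assumption, not a confirmed header):

```python
# Quick inspection of the provided CSVs outside the notebooks.
import pandas as pd

raw = pd.read_csv("GoodReads_100k_books.csv")
with_lang = pd.read_csv("goodreads_with_languages.csv")

print(raw.shape, with_lang.shape)  # the second file should carry one extra column
print(with_lang["language"].value_counts().head(10))  # assumed column name: "language"
```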
Inside the `utils` folder you will find some helpful scripts used to prepare the dataset:

- `lang_detect.py`: looks at the title and description to detect the language used in every row, processes the original dataset, and saves the result in `goodreads_with_languages.csv`. It also reports some stats about the languages found (you don't have to run it separately; everything is called from the Jupyter notebook). A hedged sketch of this kind of detection follows the stats below.
Total unique language categories: 37
Language Detection Breakdown:
Total books: 100000
Books with content: 99713
Missing content: 1
Too short content: 272
Language detection failures: 14
Unexpected errors: 0
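The actual logic lives in `utils/lang_detect.py`; the following is only a hedged sketch of how per-row detection like this can be done, assuming the `langdetect` package and `title`/`desc` column names (neither is confirmed by this README):

```python
# Illustrative sketch only -- the real logic is in utils/lang_detect.py.
# Assumes the langdetect package and "title"/"desc" column names.
import pandas as pd
from langdetect import detect
from langdetect.lang_detect_exception import LangDetectException

MIN_CHARS = 20  # assumed threshold for "too short content"

def detect_language(row: pd.Series) -> str:
    """Return a language code, or a category explaining why detection was skipped."""
    parts = [v for v in (row.get("title"), row.get("desc")) if isinstance(v, str)]
    text = " ".join(parts).strip()
    if not text:
        return "missing_content"
    if len(text) < MIN_CHARS:
        return "too_short"
    try:
        return detect(text)  # e.g. "en", "fr", "de"
    except LangDetectException:
        return "detection_failed"

books = pd.read_csv("GoodReads_100k_books.csv")
books["language"] = books.apply(detect_language, axis=1)
books.to_csv("goodreads_with_languages.csv", index=False)
print(books["language"].value_counts())
```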
- `nan.py`: prints out why the `lang_detect` script produced certain results for some books (e.g. missing or too-short content).
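As a hedged sketch of what such a diagnostic might look like (the real `nan.py` may work differently; the category labels and column names below are the same assumptions used in the sketch above):

```python
# Illustrative sketch only; the real diagnostics are in nan.py.
import pandas as pd

df = pd.read_csv("goodreads_with_languages.csv")

# Show the books that fell into a problem category and a hint about why.
problem_labels = ["missing_content", "too_short", "detection_failed"]
problems = df[df["language"].isin(problem_labels)]
for _, row in problems.iterrows():
    desc = row["desc"] if isinstance(row["desc"], str) else ""
    print(f"{row['title']!r}: {row['language']} (description length {len(desc)})")
```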
License: MIT