Initial commit

- Python package to fetch song lyrics from paroles.net
- Web scraping functionality with requests and BeautifulSoup4
- Command-line interface for easy usage
- Comprehensive test suite with pytest
- GitLab CI configuration with uv support
- Package metadata and dependencies in pyproject.toml
- Documentation and usage instructions
2025-08-11 14:21:12 +02:00
commit af71f5e80a
11 changed files with 989 additions and 0 deletions
@@ -0,0 +1,98 @@
# Paroles.net Scraper
A Python package to fetch song lyrics from [paroles.net](https://www.paroles.net/).
## Features
- Fetches song lyrics from paroles.net
- Cleans up advertisement content from lyrics
- Handles URL construction for different artists and songs
- Command-line interface for easy usage
- Comprehensive test suite
- Installable Python package
## Installation
1. Clone or download this repository
2. Install the package in development mode:
```bash
pip install -e .
```
Or if you're using uv:
```bash
uv sync
```
## Usage
### Command Line Interface
After installation, you can use the command line interface:
```bash
paroles-scraper "Artist Name" "Song Title"
```
### Examples
```bash
paroles-scraper "Ed Sheeran" "Shape of You"
paroles-scraper "Imagine Dragons" "Believer"
```
### As a Python Package
You can also use the package directly in your Python code:
```python
from paroles_net_scraper import get_song_lyrics
lyrics = get_song_lyrics("Ed Sheeran", "Shape of You")
print(lyrics)
```
## Testing
The project includes a comprehensive test suite using pytest. To run the tests:
```bash
pytest tests/ -v
```
Or, if you're using a virtual environment, activate it first:
```bash
source .venv/bin/activate
pytest tests/ -v
```
## CI/CD
This project includes a GitLab CI configuration (`.gitlab-ci.yml`) that:
- Runs tests on multiple Python versions
- Builds the package using uv
- Can deploy to PyPI (when configured with credentials)
To use the GitLab CI pipeline:
1. Push your code to a GitLab repository
2. Ensure your GitLab runner is configured
3. Set up PyPI credentials as CI/CD variables if you want to deploy
## How it works
The package builds a paroles.net URL from the artist name and song title, fetches that page with requests, and parses the HTML with BeautifulSoup. It keeps only the lyrics text and discards advertisements and other unrelated content.
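As a rough illustration of these two steps, here is a minimal sketch and not the package's actual code. The URL layout (`/<artist-slug>/paroles-<title-slug>`), the helper names `slugify`, `build_lyrics_url`, and `extract_lyrics`, and the CSS class names `song-text` and `ad` are all assumptions made for the example. To stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup, and it works on an HTML string rather than fetching a live page.

```python
import re
import unicodedata
from html.parser import HTMLParser

BASE_URL = "https://www.paroles.net"  # site named in this README


def slugify(text: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def build_lyrics_url(artist: str, title: str) -> str:
    """Assumed (hypothetical) pattern: <base>/<artist-slug>/paroles-<title-slug>."""
    return f"{BASE_URL}/{slugify(artist)}/paroles-{slugify(title)}"


class LyricsExtractor(HTMLParser):
    """Collect text inside the lyrics container, skipping ad and script blocks."""

    VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, container_class: str = "song-text", ad_class: str = "ad"):
        super().__init__()
        self.container_class = container_class  # hypothetical class name
        self.ad_class = ad_class                # hypothetical class name
        self.depth = 0       # > 0 while inside the lyrics container
        self.skip_depth = 0  # > 0 while inside a block to ignore
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            # Void tags have no closing tag; <br> becomes a line break.
            if tag == "br" and self.depth and not self.skip_depth:
                self.chunks.append("\n")
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
            if self.skip_depth or tag in ("script", "style") or self.ad_class in classes:
                self.skip_depth += 1
        elif self.container_class in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.depth and not self.skip_depth:
            self.chunks.append(data)


def extract_lyrics(html: str) -> str:
    """Return the container's text, one non-empty line per lyric line."""
    parser = LyricsExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(build_lyrics_url("Ed Sheeran", "Shape of You"))
    sample = '<div class="song-text">Line one<br>Line two<div class="ad">Buy now</div></div>'
    print(extract_lyrics(sample))
```

Note that this sketch drops blank lines, so stanza breaks are not preserved. The real page structure may differ, in which case the container and ad class names would need adjusting.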
## Disclaimer
This package is for educational purposes only. Please respect the terms of service of paroles.net and use this package responsibly. Consider the legal and ethical implications of web scraping before using this tool.
## Dependencies
- Python 3.7+
- requests
- beautifulsoup4
- pytest (for running tests)
- uv (for dependency management and packaging)