This project applies a Genetic Algorithm (GA), a powerful metaheuristic, to solve the complex Music Festival Lineup Optimization problem. The challenge is to create an optimal schedule for 35 artists across 5 stages and 7 time slots. An "optimal" lineup is one that successfully balances three conflicting criteria: maximizing the popularity of artists in prime-time slots, maximizing simultaneous genre diversity, and minimizing scheduling conflicts between artists with overlapping fan bases.
The main objectives of this project are to:
- Develop a Genetic Algorithm to find high-quality festival lineups.
- Maximize the popularity of artists performing in prime time slots (the final slot on each stage).
- Maximize the diversity of musical genres offered simultaneously across different stages within each time slot.
- Minimize scheduling conflicts arising from artists with overlapping fan bases performing concurrently.
- Evaluate different GA configurations (selection, crossover, mutation operators, elitism) to identify the most effective strategies.
- (Extra) Compare the GA's performance against other optimization heuristics like Hill Climbing and Simulated Annealing.
This project was developed for the Computational Intelligence for Optimization (CIFO) course in the Master's in Data Science and Advanced Analytics program at NOVA IMS, during the 2nd Semester of the 2024-2025 academic year.
Dataset Source: The project utilizes two primary CSV files:
artists(in).csv: Contains details for each of the 35 artists, including their unique name, primary music genre, and a popularity score (0-100).conflicts(in).csv: A square matrix representing the degree of fan base conflict (0-1) between any two artists if they perform simultaneously.
The entire project, from data handling to algorithm implementation and analysis, was developed in Python.
The project follows a structured approach, guiding the process from understanding the problem to analyzing the results.
Figure 1: Genetic Algorithm Flowchart.
- Problem Understanding & Definition: ๐ก
- Problem: Schedule 35 unique artists into 35 slots (5 stages x 7 time slots) to optimize three conflicting objectives.
- Objective Function: A weighted sum of normalized scores for prime slot popularity, genre diversity, and conflict minimization (1 - normalized conflict). Weights are 1/3 for each.
- Constraints: Each artist assigned exactly once; no empty slots; all slots/stages uniform. Invalid solutions (duplicates, unassigned artists) are not permitted in the population.
- Notebook Section:
1. Problem Definition
- Notebook Section:
- Data Understanding & Preparation: ๐
- Datasets:
artists(in).csv,conflicts(in).csv. - Key Attributes: Artist genre, popularity, pairwise conflict scores.
- Preprocessing: Loaded data into Pandas DataFrames, converted
artists_dfto a dictionary (artists_dict) for efficient lookup, and transformedconflicts_dfinto a NumPy matrix (conflict_matrix) indexed by artist IDs.- Notebook Section:
๐งฎ Import Databases,Data Preprocessing
- Notebook Section:
- Datasets:
-
Genetic Algorithm Modeling: ๐งฌ
- Solution Representation (
LineupSolutionclass): A 2D NumPy array (NUM_STAGES x NUM_SLOTS) where each cell contains a unique artist ID. This directly maps to the festival schedule.
Code Schema:
- Fitness Function: Implemented within
LineupSolution, calculating the weighted sum of:- Normalized Prime Slot Popularity:
_calculate_normalized_popularity() - Normalized Genre Diversity:
_calculate_normalized_diversity() - Normalized Conflict Penalty (used as 1 - penalty):
_calculate_normalized_conflict()
- Normalized Prime Slot Popularity:
-
Genetic Operators (New/Adapted):
-
Selection:
tournament_selection: Selects the best individual from a random subset.rank_selection: Assigns selection probability based on rank.
-
Crossover:
cycle_crossover (CX): Preserves absolute positions from parents.
order_crossover (OX): Preserves relative order from a segment of one parent.
-
Mutation:
prime_slot_swap_mutation: Swaps an artist in a prime slot with another non-prime artist on the same stage.
insert_mutation: Moves an artist from one position to another, shifting others.
stage_shuffle_mutation: Shuffles all artists within a randomly selected stage.
-
-
Elitism: Implemented and tested (True/False) to carry over the best individual(s) to the next generation.
- Notebook Sections:
1. Solution Representation,2. Fitness Function,3. Implement New Genetic Operators - Code Modules:
library/lineupsolution.py,library/genetic_algorithms/(selection, crossover, mutation modules)
- Notebook Sections:
- Solution Representation (
-
Experimentation & Evaluation: ๐
- Grid Search: Systematically tested 2 selection x 2 crossover x 3 mutation operators x 2 elitism settings (24 configurations total).
- Parameters:
POPULATION_SIZE = 100,MAX_GENERATIONS = 100,XO_PROBABILITY = 0.9,MUT_PROBABILITY = 0.2,TOURNAMENT_SIZE = 50,NUM_RUNS = 30(for statistical significance). - Metrics: Best fitness achieved, median fitness per generation, standard deviation of fitness.
- Analysis:
- Identified overall best-performing configurations.
- Analyzed the impact of different selection, crossover, and mutation methods on convergence and solution quality.
- Assessed the effect of elitism.
- Statistical Tests: Kruskal-Wallis and Dunn's post-hoc tests (or Mann-Whitney U for pairwise) to compare configurations and algorithms.
- Visualization: Evolutionary trend plots (median fitness ยฑ std dev vs. generation), boxplots of final fitness distributions, 3D scatter plot for objective trade-offs.
- Notebook Sections:
4. Genetic Algorithm Execution (Grid Search),5. Results Processing and Saving,7. Performance Analysis and Visualization
- Notebook Sections:
-
Comparative Analysis (Optional): โ๏ธ
- Hill Climbing (HC): Implemented and tested as a local search baseline. Explores neighbors by swapping artists within the same stage.
- Simulated Annealing (SA): Implemented and tested as a probabilistic metaheuristic. Explores neighbors by swapping artists within the same stage, with a temperature-controlled acceptance of worse solutions.
- Comparison: Compared the best GA configuration's performance against HC and SA based on final solution quality over multiple runs.
- Notebook Section:
9. Optional: Comparison with Hill Climbing and Simulated Annealing
- Notebook Section:
-
Deliverables: ๐ค
- Code Implementation: This GitHub repository containing all Python scripts and Jupyter Notebooks.
- Main Notebook:
MusicLineupOptimization_CIFOProject_GroupW.ipynb - Core Logic: Files within the
library/directory.
- Main Notebook:
- Report: A PDF document detailing the problem, methodology, experiments, results, and conclusions.
- Results: Saved experiments in
./Results/and plots in./Results/Plots/.
- Code Implementation: This GitHub repository containing all Python scripts and Jupyter Notebooks.
- Best GA Configuration: The combination of Rank Selection, Cycle Crossover, and Stage Shuffle Mutation (with Elitism enabled) consistently produced the highest median fitness scores over 100 generations.
- Operator Impact:
- Crossover:
Cycle Crossover (CX)proved superior, generally leading to higher-quality solutions by preserving absolute artist positions from parents. - Mutation:
Stage Shufflewas a highly effective mutation strategy, promoting diverse exploration of the solution space.
- Crossover:
- Algorithm Comparison: The best GA configurations significantly outperformed Hill Climbing and performed on par with a well-tuned Simulated Annealing, demonstrating the power of GAs in navigating complex, multi-modal search spaces.
This project successfully demonstrated the application of Genetic Algorithms to the complex, multi-objective problem of music festival lineup optimization. The implemented GA, with carefully chosen operators and parameters, was able to find high-quality solutions that effectively balance artist popularity, genre diversity, and fan conflicts.
Future Work could explore:
- More sophisticated multi-objective optimization techniques
- Adaptive parameter tuning for GA operators.
- Hybrid approaches combining GA with local search methods.
- A hyperparameter optimization for Simulated Annealing (
$C$ ,$L$ ,$T$ ) to improve its performance.
Feel free to explore the MusicLineupOptimization_CIFOProject_GroupW.ipynb notebook for detailed implementation, experimentation, and analysis!
.
โโโ MusicLineupOptimization_CIFOProject_GroupW.ipynb # Main Jupyter Notebook with all analysis
โโโ data/
โ โโโ 3_MusicFestivalLineup/
โ โโโ artists(in).csv
โ โโโ conflicts(in).csv
โโโ library/ # Core logic for the GA and other algorithms
โ โโโ __init__.py
โ โโโ lineupsolution.py # Solution representation and fitness
โ โโโ genetic_algorithms/
โ โ โโโ __init__.py
โ โ โโโ algorithm.py # GA main loop components
โ โ โโโ crossover.py # Crossover operators
โ โ โโโ mutation.py # Mutation operators
โ โ โโโ selection.py # Selection mechanisms
โ โโโ hill_climbing.py
โ โโโ simulated_annealing.py
โโโ Results/
โ โโโ GA_GridSearchResults_MusicFestivalLineup.csv # Saved raw results from grid search
โ โโโ Plots/ # Saved visualizations
โ โโโ Top5_ConfigurationsEvolutionaryTrend.png
โ โโโ ... (other generated plots)
โโโ ReportSchemas/ # Report schemas and flowcharts
โโโ README.md # This file
- Andrรฉ Silvestre, 20240502
- Filipa Pereira, 20240509
- Umeima Mahomed, 20240543










