Skip to content

๐ŸŽถ Using a Genetic Algorithm (GA) to solve the multi-objective music festival lineup problem: maximizing popularity & diversity while minimizing fan conflicts. Developed for the CIFO course at NOVA IMS.

License

Notifications You must be signed in to change notification settings

Silvestre17/CIFO_MusicFestivalLineup_GeneticAlgorithm_MasterProject

Repository files navigation

๐ŸŽถ Music Festival Lineup Optimization using Genetic Algorithms ๐ŸŽค๐ŸŽธ

GitHub Repo

๐Ÿ“ Description

This project applies a Genetic Algorithm (GA), a powerful metaheuristic, to solve the complex Music Festival Lineup Optimization problem. The challenge is to create an optimal schedule for 35 artists across 5 stages and 7 time slots. An "optimal" lineup is one that successfully balances three conflicting criteria: maximizing the popularity of artists in prime-time slots, maximizing simultaneous genre diversity, and minimizing scheduling conflicts between artists with overlapping fan bases.

โœจ Objective

The main objectives of this project are to:

  • Develop a Genetic Algorithm to find high-quality festival lineups.
  • Maximize the popularity of artists performing in prime time slots (the final slot on each stage).
  • Maximize the diversity of musical genres offered simultaneously across different stages within each time slot.
  • Minimize scheduling conflicts arising from artists with overlapping fan bases performing concurrently.
  • Evaluate different GA configurations (selection, crossover, mutation operators, elitism) to identify the most effective strategies.
  • (Extra) Compare the GA's performance against other optimization heuristics like Hill Climbing and Simulated Annealing.

๐Ÿ“š Project Context

This project was developed for the Computational Intelligence for Optimization (CIFO) course in the Master's in Data Science and Advanced Analytics program at NOVA IMS, during the 2nd Semester of the 2024-2025 academic year.

Dataset Source: The project utilizes two primary CSV files:

  • artists(in).csv: Contains details for each of the 35 artists, including their unique name, primary music genre, and a popularity score (0-100).
  • conflicts(in).csv: A square matrix representing the degree of fan base conflict (0-1) between any two artists if they perform simultaneously.

๐Ÿ› ๏ธ Technology Stack

The entire project, from data handling to algorithm implementation and analysis, was developed in Python.

Python Pandas NumPy Matplotlib Seaborn Plotly SciPy Stats

๐Ÿ—๏ธ Project Workflow & Methodology

The project follows a structured approach, guiding the process from understanding the problem to analyzing the results.

Project Flowchart

Figure 1: Genetic Algorithm Flowchart.

  1. Problem Understanding & Definition: ๐Ÿ’ก
    • Problem: Schedule 35 unique artists into 35 slots (5 stages x 7 time slots) to optimize three conflicting objectives.
    • Objective Function: A weighted sum of normalized scores for prime slot popularity, genre diversity, and conflict minimization (1 - normalized conflict). Weights are 1/3 for each.
    • Constraints: Each artist assigned exactly once; no empty slots; all slots/stages uniform. Invalid solutions (duplicates, unassigned artists) are not permitted in the population.
      • Notebook Section: 1. Problem Definition

Python Pandas NumPy

  1. Data Understanding & Preparation: ๐Ÿ”
    • Datasets: artists(in).csv, conflicts(in).csv.
    • Key Attributes: Artist genre, popularity, pairwise conflict scores.
    • Preprocessing: Loaded data into Pandas DataFrames, converted artists_df to a dictionary (artists_dict) for efficient lookup, and transformed conflicts_df into a NumPy matrix (conflict_matrix) indexed by artist IDs.
      • Notebook Section: ๐Ÿงฎ Import Databases, Data Preprocessing

Matplotlib Seaborn Plotly

  1. Genetic Algorithm Modeling: ๐Ÿงฌ

    • Solution Representation (LineupSolution class): A 2D NumPy array (NUM_STAGES x NUM_SLOTS) where each cell contains a unique artist ID. This directly maps to the festival schedule.

    Genotype Solution

    Code Schema:

    Code Schema

    • Fitness Function: Implemented within LineupSolution, calculating the weighted sum of:
      • Normalized Prime Slot Popularity: _calculate_normalized_popularity()
      • Normalized Genre Diversity: _calculate_normalized_diversity()
      • Normalized Conflict Penalty (used as 1 - penalty): _calculate_normalized_conflict()

    Fitness Function

    • Genetic Operators (New/Adapted):

      • Selection:

        • tournament_selection: Selects the best individual from a random subset.
        • rank_selection: Assigns selection probability based on rank.
      • Crossover:

        • cycle_crossover (CX): Preserves absolute positions from parents.

        Cycle Crossover

        • order_crossover (OX): Preserves relative order from a segment of one parent.

        Order Crossover

      • Mutation:

        • prime_slot_swap_mutation: Swaps an artist in a prime slot with another non-prime artist on the same stage.

        Prime Slot Swap Mutation

        • insert_mutation: Moves an artist from one position to another, shifting others.

        Insert Mutation

        • stage_shuffle_mutation: Shuffles all artists within a randomly selected stage.

        Stage Shuffle Mutation

    • Elitism: Implemented and tested (True/False) to carry over the best individual(s) to the next generation.

      • Notebook Sections: 1. Solution Representation, 2. Fitness Function, 3. Implement New Genetic Operators
      • Code Modules: library/lineupsolution.py, library/genetic_algorithms/ (selection, crossover, mutation modules)
  2. Experimentation & Evaluation: ๐Ÿ“Š

    • Grid Search: Systematically tested 2 selection x 2 crossover x 3 mutation operators x 2 elitism settings (24 configurations total).
    • Parameters: POPULATION_SIZE = 100, MAX_GENERATIONS = 100, XO_PROBABILITY = 0.9, MUT_PROBABILITY = 0.2, TOURNAMENT_SIZE = 50, NUM_RUNS = 30 (for statistical significance).
    • Metrics: Best fitness achieved, median fitness per generation, standard deviation of fitness.
    • Analysis:
      • Identified overall best-performing configurations.
      • Analyzed the impact of different selection, crossover, and mutation methods on convergence and solution quality.
      • Assessed the effect of elitism.
    • Statistical Tests: Kruskal-Wallis and Dunn's post-hoc tests (or Mann-Whitney U for pairwise) to compare configurations and algorithms.
    • Visualization: Evolutionary trend plots (median fitness ยฑ std dev vs. generation), boxplots of final fitness distributions, 3D scatter plot for objective trade-offs.
      • Notebook Sections: 4. Genetic Algorithm Execution (Grid Search), 5. Results Processing and Saving, 7. Performance Analysis and Visualization
  3. Comparative Analysis (Optional): โš”๏ธ

    • Hill Climbing (HC): Implemented and tested as a local search baseline. Explores neighbors by swapping artists within the same stage.

    Hill Climbing

    • Simulated Annealing (SA): Implemented and tested as a probabilistic metaheuristic. Explores neighbors by swapping artists within the same stage, with a temperature-controlled acceptance of worse solutions.

    Simulated Annealing

    • Comparison: Compared the best GA configuration's performance against HC and SA based on final solution quality over multiple runs.
      • Notebook Section: 9. Optional: Comparison with Hill Climbing and Simulated Annealing
  4. Deliverables: ๐Ÿ“ค

    • Code Implementation: This GitHub repository containing all Python scripts and Jupyter Notebooks.
      • Main Notebook: MusicLineupOptimization_CIFOProject_GroupW.ipynb
      • Core Logic: Files within the library/ directory.
    • Report: A PDF document detailing the problem, methodology, experiments, results, and conclusions.
    • Results: Saved experiments in ./Results/ and plots in ./Results/Plots/.

๐Ÿ“ˆ Key Results & Insights

  • Best GA Configuration: The combination of Rank Selection, Cycle Crossover, and Stage Shuffle Mutation (with Elitism enabled) consistently produced the highest median fitness scores over 100 generations.
  • Operator Impact:
    • Crossover: Cycle Crossover (CX) proved superior, generally leading to higher-quality solutions by preserving absolute artist positions from parents.
    • Mutation: Stage Shuffle was a highly effective mutation strategy, promoting diverse exploration of the solution space.
  • Algorithm Comparison: The best GA configurations significantly outperformed Hill Climbing and performed on par with a well-tuned Simulated Annealing, demonstrating the power of GAs in navigating complex, multi-modal search spaces.

๐Ÿ“š Conclusion & Future Work

This project successfully demonstrated the application of Genetic Algorithms to the complex, multi-objective problem of music festival lineup optimization. The implemented GA, with carefully chosen operators and parameters, was able to find high-quality solutions that effectively balance artist popularity, genre diversity, and fan conflicts.

Future Work could explore:

  • More sophisticated multi-objective optimization techniques
  • Adaptive parameter tuning for GA operators.
  • Hybrid approaches combining GA with local search methods.
  • A hyperparameter optimization for Simulated Annealing ($C$, $L$, $T$) to improve its performance.

Feel free to explore the MusicLineupOptimization_CIFOProject_GroupW.ipynb notebook for detailed implementation, experimentation, and analysis!


๐Ÿ“‚ Repository Structure

.
โ”œโ”€โ”€ MusicLineupOptimization_CIFOProject_GroupW.ipynb  # Main Jupyter Notebook with all analysis
โ”œโ”€โ”€ data/
โ”‚   โ””โ”€โ”€ 3_MusicFestivalLineup/
โ”‚       โ”œโ”€โ”€ artists(in).csv
โ”‚       โ””โ”€โ”€ conflicts(in).csv
โ”œโ”€โ”€ library/                                          # Core logic for the GA and other algorithms
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”œโ”€โ”€ lineupsolution.py                             # Solution representation and fitness
โ”‚   โ”œโ”€โ”€ genetic_algorithms/
โ”‚   โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”‚   โ”œโ”€โ”€ algorithm.py                              # GA main loop components
โ”‚   โ”‚   โ”œโ”€โ”€ crossover.py                              # Crossover operators
โ”‚   โ”‚   โ”œโ”€โ”€ mutation.py                               # Mutation operators
โ”‚   โ”‚   โ””โ”€โ”€ selection.py                              # Selection mechanisms
โ”‚   โ”œโ”€โ”€ hill_climbing.py
โ”‚   โ””โ”€โ”€ simulated_annealing.py
โ”œโ”€โ”€ Results/
โ”‚   โ”œโ”€โ”€ GA_GridSearchResults_MusicFestivalLineup.csv  # Saved raw results from grid search
โ”‚   โ””โ”€โ”€ Plots/                                        # Saved visualizations
โ”‚       โ”œโ”€โ”€ Top5_ConfigurationsEvolutionaryTrend.png
โ”‚       โ””โ”€โ”€ ... (other generated plots)
โ”œโ”€โ”€ ReportSchemas/                                    # Report schemas and flowcharts
โ””โ”€โ”€ README.md                                         # This file

๐Ÿ‘ฅ Team (Group W)

  • Andrรฉ Silvestre, 20240502
  • Filipa Pereira, 20240509
  • Umeima Mahomed, 20240543

About

๐ŸŽถ Using a Genetic Algorithm (GA) to solve the multi-objective music festival lineup problem: maximizing popularity & diversity while minimizing fan conflicts. Developed for the CIFO course at NOVA IMS.

Topics

Resources

License

Stars

Watchers

Forks

Contributors 3

  •  
  •  
  •