'n_periods' does not respect dropped link ratios #604

@andrewwilson201

Description

Are you on the latest chainladder version?

  • Yes, this bug occurs on the latest version.

Describe the bug in words

Currently, the Development estimator applies the n_periods parameter before it considers any exclusions from drop, drop_valuation, etc. A dropped link ratio therefore still occupies a slot in the n_periods window, which leads to unexpected results when attempting to exclude specific origin-development combinations from LDF calculations.

Current Behavior

In the example below, when n_periods=1 is specified along with drop=[("2009", 12)], the system:

  1. Identifies the most recent period (2009, the latest origin with a link ratio at development month 12)
  2. Applies the drop, resulting in no valid periods for the calculation
  3. Returns NaN or uses an incorrect period

Expected Behavior

When n_periods=1 is specified along with drop=[("2009", 12)], the system should:

  1. Identify which periods are valid (non-dropped)
  2. Select the most recent valid period (skipping 2009)
  3. Use that period (e.g., 2008) for the LDF calculation
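
To make the two orderings concrete, here is a minimal sketch in plain Python. It does not use chainladder and is not how the library is implemented. The helper names window_then_drop and drop_then_window are hypothetical, the ratios are hand-computed from the losses in the reproduction script, and a simple average stands in for the estimator's averaging (with a single ratio the choice of average makes no difference).

```python
import math

# Development-month-12 link ratios, oldest origin first (dicts keep insertion order).
# Hand-computed from the example triangle: 200/100, 300/150, 250/200.
ratios = {"2007": 2.0, "2008": 2.0, "2009": 1.25}
dropped = {"2009"}  # origins excluded at development month 12
n_periods = 1


def window_then_drop(ratios, dropped, n):
    """Order described as current: take the n most recent origins, then remove drops."""
    recent = list(ratios)[-n:]
    kept = [ratios[o] for o in recent if o not in dropped]
    return sum(kept) / len(kept) if kept else math.nan


def drop_then_window(ratios, dropped, n):
    """Order requested here: remove drops, then take the n most recent survivors."""
    valid = [o for o in ratios if o not in dropped]
    kept = [ratios[o] for o in valid[-n:]]
    return sum(kept) / len(kept) if kept else math.nan


current = window_then_drop(ratios, dropped, n_periods)   # window is empty after the drop
expected = drop_then_window(ratios, dropped, n_periods)  # falls back to the 2008 ratio
print(current, expected)
```

The only difference between the two helpers is whether the slice [-n:] is taken before or after the dropped origins are filtered out.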

How can the bug be reproduced?

import chainladder as cl
import pandas as pd

# create example triangle
data = {
    'origin': ["2007-01-01", "2007-01-01", "2007-01-01", "2007-01-01", "2008-01-01", "2008-01-01", "2008-01-01", "2009-01-01", "2009-01-01", "2010-01-01"],
    'development': ["2007-01-01", "2008-01-01", "2009-01-01", "2010-01-01", "2008-01-01", "2009-01-01", "2010-01-01", "2009-01-01", "2010-01-01", "2010-01-01"],
    'loss': [100, 200, 300, 400, 150, 300, 450, 200, 250, 50]
}

df = pd.DataFrame(data)
tri = cl.Triangle(
    df, 
    origin='origin',
    development='development',
    columns='loss',
    cumulative=True
)

# calculate ldf with n_periods=1 and the most recent period (2009) dropped
dev = cl.Development(n_periods=1, drop=[('2009', 12)]).fit(tri)
dev.ldf_

# current behaviour: only the excluded 2009 ratio at dev month 12 is used, resulting in NaN
# desired behaviour: use the valid (non-dropped) ratios, i.e. the 2008 ratio at
# dev month 12, so the expected result is 300 / 150 = 2

What is the expected behavior?

The n_periods parameter should be applied after all drop logic. It should select the n most recent link ratios from the set of all valid, non-dropped, non-NaN data points. This ensures that the n_periods argument is always honored when enough valid data is available.
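
As a rough illustration of that rule across a whole triangle, the sketch below filters out dropped, missing, and NaN link ratios first and only then keeps the n most recent ones per development age. It is plain Python rather than chainladder code: the function ldfs_after_drops, the nested-dict triangle layout, and the volume-weighted average are all assumptions made for this example.

```python
import math

# Cumulative losses from the example triangle, keyed by origin then age in months.
triangle = {
    "2007": {12: 100, 24: 200, 36: 300, 48: 400},
    "2008": {12: 150, 24: 300, 36: 450},
    "2009": {12: 200, 24: 250},
    "2010": {12: 50},
}
ages = [12, 24, 36, 48]


def ldfs_after_drops(triangle, ages, drops, n_periods):
    """Volume-weighted LDF per age, using the n most recent valid link ratios.

    drops is a set of (origin, age) pairs naming link ratios to exclude.
    Origins are assumed to be ordered oldest first.
    """
    result = {}
    for start_age, end_age in zip(ages, ages[1:]):
        valid = []
        for origin, losses in triangle.items():
            start, end = losses.get(start_age), losses.get(end_age)
            if start is None or end is None:
                continue  # no link ratio exists for this origin at this age
            if (origin, start_age) in drops:
                continue  # explicitly dropped
            if math.isnan(start) or math.isnan(end) or start == 0:
                continue  # ratio would be NaN or undefined
            valid.append((start, end))
        recent = valid[-n_periods:]  # the window is applied last
        if recent:
            result[start_age] = sum(e for _, e in recent) / sum(s for s, _ in recent)
        else:
            result[start_age] = math.nan  # not enough valid data
    return result


result = ldfs_after_drops(triangle, ages, {("2009", 12)}, n_periods=1)
print(result)
```

With the drop from the reproduction script, the development-month-12 factor comes from the 2008 origin (300 / 150). NaN appears only when every ratio at an age has been excluded.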

Metadata

Labels

  • Effort > Brief 🐇: Small tasks expected to take a few hours up to a couple of days.
  • Great First Contribution! 🌱: Beginner friendly tickets with narrow scope and huge impact. Perfect to join our community!
  • Impact > Minor 🔷: Small, backward compatible change. Treat like a patch release (e.g., 0.5.8 → 0.5.9).
