Add ACA marketplace bronze-selection target ETL #618
Open
daphnehanse11 wants to merge 15 commits into main from codex/aca-marketplace-plan-selection
7d52739 Add ACA marketplace plan selection proxies (daphnehanse11)
68beaf4 Format marketplace plan selection files (daphnehanse11)
bb4ab99 Add marketplace fallback premium data (daphnehanse11)
3d1fa04 Add marketplace target ETL and validator scaffold (daphnehanse11)
63b4eae Fix marketplace plan selection lint (daphnehanse11)
792e4d7 Merge remote-tracking branch 'origin/main' into codex/aca-marketplace… (daphnehanse11)
7e2be50 Use selected-plan ratio for ACA bronze targets (daphnehanse11)
06799fe Add changelog fragment for marketplace proxy work (daphnehanse11)
7bb5a95 Rename changelog fragment for Towncrier (daphnehanse11)
b1ee088 Refocus ACA marketplace PR on ETL targets (daphnehanse11)
4583212 Restore numpy import in ACA target test (daphnehanse11)
c11dccd Merge remote-tracking branch 'origin/main' into codex/pr-618-review (daphnehanse11)
d22092a Clarify ACA marketplace ETL inputs (daphnehanse11)
bb6475a Update policyengine-us lock for ACA marketplace ETL (daphnehanse11)
eb1d7bf Add ACA marketplace targets to calibration config (daphnehanse11)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| Add an ACA marketplace ETL that loads state-level HC.gov bronze-plan | ||
| selection targets for APTC recipients into the calibration database. |
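To make the changelog entry concrete, here is a minimal sketch of how a bronze-selection target among APTC recipients could be derived for one state. It uses plain Python on a few made-up rows instead of pandas. The column names follow the ETL module in this PR, but the numbers, the `02-example` status label, and the helper name `bronze_aptc_summary` are illustrative assumptions, not CMS data or project code.

```python
from collections import defaultdict

# Made-up rows shaped like the normalized state-metal CSV (NOT real CMS figures).
# "02-example" is a hypothetical enrollment-status label used only for this sketch.
ROWS = [
    {"state_code": "TX", "platform": "HC.gov", "metal_level": "All",
     "enrollment_status": "01-atv", "aptc_consumers": 600.0},
    {"state_code": "TX", "platform": "HC.gov", "metal_level": "All",
     "enrollment_status": "02-example", "aptc_consumers": 400.0},
    {"state_code": "TX", "platform": "HC.gov", "metal_level": "B",
     "enrollment_status": "All", "aptc_consumers": 250.0},
    # Non-HC.gov platform: dropped by the platform filter.
    {"state_code": "CA", "platform": "SBM", "metal_level": "B",
     "enrollment_status": "All", "aptc_consumers": 999.0},
]


def bronze_aptc_summary(rows):
    """Return per-state total APTC consumers, bronze APTC consumers, and share."""
    totals = defaultdict(float)  # sum over metal_level == "All" rows
    bronze = {}                  # the single bronze row with status "All"
    for row in rows:
        if row["platform"] != "HC.gov" or row["aptc_consumers"] is None:
            continue
        state = row["state_code"]
        if row["metal_level"] == "All":
            totals[state] += row["aptc_consumers"]
        elif row["metal_level"] == "B" and row["enrollment_status"] == "All":
            bronze[state] = row["aptc_consumers"]

    # Keep only states present on both sides (an inner join). The extra
    # zero-total guard is an addition of this sketch to avoid dividing by zero.
    return {
        state: {
            "total": totals[state],
            "bronze": bronze[state],
            "share": bronze[state] / totals[state],
        }
        for state in sorted(bronze)
        if totals.get(state, 0.0) > 0
    }


summary = bronze_aptc_summary(ROWS)
print(summary)
```

With these toy rows only Texas survives the platform filter, and its bronze share is the bronze APTC count divided by the summed aggregate rows.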
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,223 @@ | ||
| from __future__ import annotations | ||
| | ||
| import logging | ||
| from pathlib import Path | ||
| | ||
| import pandas as pd | ||
| from sqlmodel import Session, create_engine | ||
| | ||
| from policyengine_us_data.calibration.calibration_utils import STATE_CODES | ||
| from policyengine_us_data.db.create_database_tables import ( | ||
|     Stratum, | ||
|     StratumConstraint, | ||
|     Target, | ||
| ) | ||
| from policyengine_us_data.storage import CALIBRATION_FOLDER, STORAGE_FOLDER | ||
| from policyengine_us_data.utils.db import etl_argparser, get_geographic_strata | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| # `selected_marketplace_plan_benchmark_ratio == 1.0` represents benchmark | ||
| # silver coverage, so bronze plan selections are the subset below this ratio. | ||
| BENCHMARK_SILVER_RATIO = 1.0 | ||
| | ||
| STATE_METAL_SELECTION_PATH = ( | ||
baogorek marked this conversation as resolved.
|     CALIBRATION_FOLDER / "aca_marketplace_state_metal_selection_2024.csv" | ||
| ) | ||
| | ||
| STATE_ABBR_TO_FIPS = {abbr: fips for fips, abbr in STATE_CODES.items()} | ||
| | ||
| | ||
| def _extra_args(parser) -> None: | ||
|     parser.add_argument( | ||
|         "--state-metal-csv", | ||
|         type=Path, | ||
|         default=STATE_METAL_SELECTION_PATH, | ||
|         help=("State-metal CMS OEP proxy CSV. Default: %(default)s"), | ||
|     ) | ||
| | ||
| | ||
| def extract_aca_marketplace_state_metal_data( | ||
|     state_metal_csv_path: Path, | ||
| ) -> pd.DataFrame: | ||
|     """Extract CMS marketplace state metal-status inputs from the checked-in CSV. | ||
| | ||
|     This ETL keeps an explicit extract step even though the source file already | ||
|     lives in the repository. The original CMS 2024 OEP state metal status PUF | ||
|     is not currently pulled from a stable direct-download endpoint in CI, so we | ||
|     store the normalized input CSV at | ||
|     `policyengine_us_data/storage/calibration_targets/aca_marketplace_state_metal_selection_2024.csv`. | ||
| | ||
|     To reproduce or update that file: | ||
|     1. Download the CMS 2024 OEP state metal status public use file. | ||
|     2. Preserve one row per state/platform/metal/enrollment-status combination. | ||
|     3. Keep the `state_code`, `platform`, `metal_level`, | ||
|        `enrollment_status`, `consumers`, and `aptc_consumers` columns. | ||
|     4. Save the normalized output back to `state_metal_csv_path`. | ||
|     """ | ||
|     return pd.read_csv(state_metal_csv_path) | ||
| | ||
| | ||
| def build_state_marketplace_bronze_aptc_targets( | ||
|     state_metal_df: pd.DataFrame, | ||
| ) -> pd.DataFrame: | ||
|     """ | ||
|     Build HC.gov state bronze-selection targets among APTC consumers. | ||
| | ||
|     The 2024 CMS state-metal-status PUF exposes: | ||
|     - metal rows (`B`, `G`, `S`) with enrollment_status=`All` | ||
|     - aggregate rows (`All`) broken out by enrollment status (`01-atv`, etc.) | ||
| | ||
|     We use: | ||
|     - total APTC consumers = sum of `aptc_consumers` for `metal_level == All` | ||
|       across enrollment statuses | ||
|     - bronze APTC consumers = `aptc_consumers` on the bronze row | ||
|     """ | ||
|     df = state_metal_df.copy() | ||
|     df = df[df["platform"] == "HC.gov"].copy() | ||
| | ||
|     total_rows = df[ | ||
|         (df["metal_level"] == "All") & (df["aptc_consumers"].notna()) | ||
|     ].copy() | ||
|     bronze_rows = df[ | ||
|         (df["metal_level"] == "B") | ||
|         & (df["enrollment_status"] == "All") | ||
|         & (df["aptc_consumers"].notna()) | ||
|     ].copy() | ||
| | ||
|     total_aptc = total_rows.groupby("state_code", as_index=False).agg( | ||
|         marketplace_aptc_consumers=("aptc_consumers", "sum"), | ||
|         marketplace_consumers=("consumers", "sum"), | ||
|     ) | ||
|     bronze_aptc = bronze_rows[["state_code", "aptc_consumers", "consumers"]].rename( | ||
|         columns={ | ||
|             "aptc_consumers": "bronze_aptc_consumers", | ||
|             "consumers": "bronze_consumers", | ||
|         } | ||
|     ) | ||
| | ||
|     result = total_aptc.merge(bronze_aptc, on="state_code", how="inner") | ||
|     result["state_fips"] = result["state_code"].map(STATE_ABBR_TO_FIPS) | ||
|     result = result[result["state_fips"].notna()].copy() | ||
|     result["state_fips"] = result["state_fips"].astype(int) | ||
|     result["bronze_aptc_share"] = ( | ||
|         result["bronze_aptc_consumers"] / result["marketplace_aptc_consumers"] | ||
|     ) | ||
|     result.insert(0, "year", 2024) | ||
|     result.insert(1, "source", "cms_2024_oep_state_metal_status_puf") | ||
|     return result.sort_values("state_code").reset_index(drop=True) | ||
| | ||
| | ||
| def load_state_marketplace_bronze_aptc_targets( | ||
|     targets_df: pd.DataFrame, | ||
|     year: int, | ||
| ) -> None: | ||
|     db_url = f"sqlite:///{STORAGE_FOLDER / 'calibration' / 'policy_data.db'}" | ||
|     engine = create_engine(db_url) | ||
| | ||
|     with Session(engine) as session: | ||
|         geo_strata = get_geographic_strata(session) | ||
| | ||
|         for row in targets_df.itertuples(index=False): | ||
|             state_fips = int(row.state_fips) | ||
|             parent_id = geo_strata["state"].get(state_fips) | ||
|             if parent_id is None: | ||
|                 logger.warning( | ||
|                     "No state geographic stratum for FIPS %s, skipping", state_fips | ||
|                 ) | ||
|                 continue | ||
| | ||
|             # We intentionally do not subset to `tax_unit_is_filer == 1`. | ||
|             # These CMS targets describe marketplace coverage groups rather | ||
|             # than the IRS filer universe, so the closest calibration entity is | ||
|             # a tax unit with positive modeled APTC use. | ||
|             aptc_stratum = Stratum( | ||
|                 parent_stratum_id=parent_id, | ||
|                 notes=f"State FIPS {state_fips} Marketplace APTC recipients", | ||
|             ) | ||
|             aptc_stratum.constraints_rel = [ | ||
|                 StratumConstraint( | ||
|                     constraint_variable="state_fips", | ||
|                     operation="==", | ||
|                     value=str(state_fips), | ||
|                 ), | ||
|                 StratumConstraint( | ||
|                     constraint_variable="used_aca_ptc", | ||
|                     operation=">", | ||
|                     value="0", | ||
|                 ), | ||
baogorek marked this conversation as resolved.
|             ] | ||
|             aptc_stratum.targets_rel.append( | ||
|                 Target( | ||
|                     # We use `tax_unit_count` rather than household/person | ||
|                     # counts because insurance groups map most closely to | ||
|                     # PolicyEngine tax units in the current calibration schema. | ||
|                     variable="tax_unit_count", | ||
baogorek marked this conversation as resolved.
|                     period=year, | ||
|                     value=float(row.marketplace_aptc_consumers), | ||
|                     active=True, | ||
|                     source="CMS 2024 OEP state metal status PUF", | ||
|                     notes="HC.gov APTC consumers across all enrollment statuses", | ||
|                 ) | ||
|             ) | ||
|             session.add(aptc_stratum) | ||
|             session.flush() | ||
| | ||
|             bronze_stratum = Stratum( | ||
|                 parent_stratum_id=aptc_stratum.stratum_id, | ||
|                 notes=f"State FIPS {state_fips} Marketplace bronze APTC recipients", | ||
|             ) | ||
|             bronze_stratum.constraints_rel = [ | ||
|                 StratumConstraint( | ||
|                     constraint_variable="state_fips", | ||
|                     operation="==", | ||
|                     value=str(state_fips), | ||
|                 ), | ||
|                 StratumConstraint( | ||
|                     constraint_variable="used_aca_ptc", | ||
|                     operation=">", | ||
|                     value="0", | ||
|                 ), | ||
|                 StratumConstraint( | ||
|                     constraint_variable="selected_marketplace_plan_benchmark_ratio", | ||
|                     operation="<", | ||
|                     value=str(BENCHMARK_SILVER_RATIO), | ||
|                 ), | ||
|             ] | ||
|             bronze_stratum.targets_rel.append( | ||
|                 Target( | ||
|                     variable="tax_unit_count", | ||
|                     period=year, | ||
|                     value=float(row.bronze_aptc_consumers), | ||
|                     active=True, | ||
|                     source="CMS 2024 OEP state metal status PUF", | ||
|                     notes="HC.gov bronze plan selections among APTC consumers", | ||
|                 ) | ||
|             ) | ||
|             session.add(bronze_stratum) | ||
|             session.flush() | ||
| | ||
|         session.commit() | ||
| | ||
| | ||
| def main() -> None: | ||
|     args, year = etl_argparser( | ||
|         "ETL for ACA marketplace bronze-selection calibration targets", | ||
|         extra_args_fn=_extra_args, | ||
|     ) | ||
| | ||
|     state_metal = extract_aca_marketplace_state_metal_data(args.state_metal_csv) | ||
|     targets_df = build_state_marketplace_bronze_aptc_targets(state_metal) | ||
|     if targets_df.empty: | ||
|         raise RuntimeError("No HC.gov marketplace bronze/APTC targets were generated.") | ||
| | ||
|     print( | ||
|         "Loading ACA marketplace bronze/APTC state targets for " | ||
|         f"{len(targets_df)} states from {args.state_metal_csv}" | ||
|     ) | ||
|     load_state_marketplace_bronze_aptc_targets(targets_df, year) | ||
|     print("ACA marketplace bronze/APTC targets loaded.") | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     main() | ||
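The loader above nests a bronze stratum under each state's APTC stratum by repeating the parent's constraints and adding a benchmark-ratio cutoff. As a rough illustration of what those constraint triples select, the sketch below evaluates them against a few made-up tax units. The `matches` helper, the `OPS` table, and the toy records are assumptions for this example only; the diff does not show how the calibration code evaluates `StratumConstraint` rows, and a real run would presumably use weighted rather than raw counts.

```python
import operator

# Map the operation strings used in the ETL to Python comparisons.
OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}

# (variable, operation, value) triples mirroring the ETL's constraints for
# Texas (state FIPS 48). Values are strings, as in the ETL.
APTC_CONSTRAINTS = [
    ("state_fips", "==", "48"),
    ("used_aca_ptc", ">", "0"),
]
BRONZE_CONSTRAINTS = APTC_CONSTRAINTS + [
    ("selected_marketplace_plan_benchmark_ratio", "<", "1.0"),
]


def matches(unit, constraints):
    """Hypothetical evaluator: True if the unit satisfies every constraint."""
    return all(
        OPS[op](float(unit[variable]), float(value))
        for variable, op, value in constraints
    )


# Made-up tax units; none of these figures come from real data.
units = [
    # APTC recipient on a plan cheaper than the benchmark: counts as bronze.
    {"state_fips": 48, "used_aca_ptc": 3200.0,
     "selected_marketplace_plan_benchmark_ratio": 0.8},
    # APTC recipient exactly at the benchmark ratio: excluded by the strict "<".
    {"state_fips": 48, "used_aca_ptc": 1500.0,
     "selected_marketplace_plan_benchmark_ratio": 1.0},
    # No APTC use: outside both strata.
    {"state_fips": 48, "used_aca_ptc": 0.0,
     "selected_marketplace_plan_benchmark_ratio": 0.7},
    # Different state: outside both strata.
    {"state_fips": 6, "used_aca_ptc": 2000.0,
     "selected_marketplace_plan_benchmark_ratio": 0.8},
]

aptc_count = sum(matches(u, APTC_CONSTRAINTS) for u in units)
bronze_count = sum(matches(u, BRONZE_CONSTRAINTS) for u in units)
print(aptc_count, bronze_count)
```

Because the bronze constraints are a superset of the APTC constraints, every unit counted in the bronze stratum is also counted in its parent, which matches the parent/child relationship set through `parent_stratum_id`.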