Some Thoughts on SB 53: California's AI Whistleblower Bill
A Step Forward, With Room for Improvement
Context
In case you have not seen it: Senator Scott Wiener has introduced SB 53. Link to his announcement here. Link to text of draft bill here.
Senator Wiener's proposed legislation creates whistleblower protections for employees reporting critical risks from AI foundation models in California. This would add Chapter 5.1 (commencing with Section 1107) to Part 3 of Division 2 of the Labor Code.
The bill would prohibit AI foundation model developers (those who have trained models with at least $100M in compute) from retaliating against employees or other individuals who report to authorities (a) critical risks (defined as foreseeable risks of 100+ deaths or $1B+ in damages) or (b) misleading statements about the management of those critical risks. The bill would also require companies to establish internal anonymous reporting mechanisms that give whistleblowers regular updates on case progress.
We look forward to your thoughts on the bill and your feedback on our thinking, either in the comments below or at hello@oais.is.
Thanks to Claudia Wilson from the Center for AI Policy for her feedback on a previous draft of this post.
Overall Assessment
The bill represents meaningful progress toward reducing catastrophic risks from AI but still contains limitations that may undermine its effectiveness.
The bill appears designed to minimize objections during the legislative process, maximize its chances of passing, and cover tail risks, rather than to provide comprehensive protection for whistleblowing across all AI risks. Our analysis focuses on how to improve the bill and its implementation while maintaining this core framing.
Key Recommendations
For details on why we make these recommendations, see the section Limitations and Gaps below or reach out at hello@oais.is.
0. Clarify/Create Capacity to Handle Reports via Hotline
Critical:
Equip the office handling whistleblowing cases submitted via the hotline hosted by the Attorney General with the technical capacity to evaluate complex AI risk claims
Create provisions for integrating third-party (technical) assessments, including from civil society organizations
Important:
Establish advisory services for potential whistleblowers to help evaluate ‘risk scale’ before submission (with these consultations also protected)
1. Expand Model Coverage
Critical:
Include all AI systems capable of causing the defined harms (e.g., specialized bio models, which would currently be excluded), regardless of architecture or training costs.
Accordingly, remove the arbitrary $100M compute threshold and the "general purpose" and "breadth of data" requirements, which create loopholes
2. Broaden Risk Categories and Lower Thresholds
Critical:
Focus on scale of potential harm rather than specific harm mechanisms
If specific harm mechanisms are retained: Add a "catch-all" category for any risk meeting the damage threshold, regardless of mechanism (e.g., critical infrastructure failures currently excluded)
Important:
Lower harm thresholds from the current 100+ deaths or $1B+ in damages to minimize ‘false negatives’, i.e., individuals not flagging potential catastrophic risks
3. Strengthen Whistleblower Protections
Critical:
Explicitly protect good faith reporting even if risks later fall below thresholds, going further than ‘reasonable cause’
Establish protections for escalating to the public in cases of insufficient case handling, high urgency, or concerns about evidence destruction, provided any potential IP "leakage" is justified by the public interest and the scale of the risk. There is precedent for this in the EU Whistleblower Directive and, within the AI world, in the right to warn.
Important:
Create mechanisms for whistleblowers to safely seek external technical/legal help
Establish minimum requirements for internal lab whistleblowing processes
4. Enhance Enforcement and Accountability
Critical:
Establish explicit, substantial penalties for retaliation or for limiting whistleblowing, proportionate to the potential damage that suppressed reporting could cause ($1B)
Require mandatory reporting to regulators of risks identified through internal channels to prevent ‘sweeping under the rug’
Important:
Clarify whether general Labor Code penalties (Section 1102.5) apply to violations
Detailed Analysis
Strengths of the Legislation
Comprehensive Life Cycle Coverage
The bill covers both development and deployment scenarios, which is crucial for addressing AI risks at all stages. Development-stage protection enables early risk identification before models reach deployment, while deployment-stage protection addresses risks that only emerge in real-world applications. This prevents companies from compartmentalizing responsibility between teams.
Accountability for Public Risk Statements
The bill creates accountability for "false or misleading statements about management of critical risk." This addresses the "safety theater" problem where companies claim robust safeguards without implementation. It enables verification of public claims about risk mitigation strategies and creates pressure for alignment between public statements and internal practices.
In combination with coverage of deployment-stage risks, this creates a real basis for gaining confidence in a statement like “if a lab doesn't have accurate monitoring in place that could be expected to identify all critical risks, then we will find out about it.” This approach could form the basis for holding labs accountable for taking post-deployment use of their models seriously.
Procedural Elements
The bill establishes clear reporting channels, including pointing to an existing hotline, which removes ambiguity about proper reporting procedures that often undermines protection. It creates standardized mechanisms across different companies, allows for anonymous reporting, lowers the threshold for reporting, and makes retaliation more challenging.
The requirement for monthly updates to whistleblowers on investigation status prevents reports from disappearing into bureaucratic processes, creates an ongoing documentation trail relevant for potential litigation, and establishes continuous engagement that increases the likelihood of substantive action.
Retaliation Protection Framework
The bill uses a "reasonable cause to believe" standard for protection, which acknowledges inherent uncertainty in predicting AI risks. It doesn't require whistleblowers to definitively prove dangers and recognizes information asymmetry between employees and management.
Importantly, the bill also shifts the burden of proof to employers in retaliation cases, addressing power imbalance between individual whistleblowers and corporate entities. This recognizes the difficulty of proving retaliatory intent and creates a stronger deterrent against subtle forms of retaliation.
The bill also provides temporary injunctive relief options, offering immediate protection during potentially lengthy legal proceedings and helping prevent irreparable career and financial damage to whistleblowers.
Limitations and Gaps
1. Overly Restricted Model Scope
The bill only applies to "foundation models" with $100M+ in compute costs, creating an arbitrary distinction unrelated to actual risk profiles. This excludes potentially harmful narrow AI systems with significant impact potential and ignores that critical danger can come from specialized systems with focused capabilities.
The approach creates avoidance opportunities through strategic structuring of training processes to remain below threshold or distributed training across multiple smaller models. As compute costs decline, dangerous models may fall below threshold while remaining capable of causing significant harm.
Recommendation 1: Include all AI systems capable of causing defined harms, regardless of architecture or training costs, and remove arbitrary compute thresholds that create loopholes.
2. Excessive Risk Thresholds
The bill requires death/injury to 100+ people or $1B+ in damages, setting an unreasonably high bar compared to other safety regulations. This creates scenarios where even significant harms fall outside protection and may lead to "false negatives" where critical risks go unreported.
The limited risk categories omit plausible harm vectors. While the bill introduces a 'critical risk' definition based on potential harmful outcomes (which is sensible), it then limits what specific forms these risks can take. This approach is problematic—we should care about damage scale rather than specific harm mechanisms.
For example, critical infrastructure failures that exceed the $1B threshold would not intuitively fall into any of the four specified categories. There is no other obvious whistleblower protection for an employee of a frontier lab who wants to report that a foundation model is being employed in a way that could cause tremendous damage, for example by mismanaging electricity grids.
Our understanding is that these limitations might be rooted in (a) wanting to regulate models instead of companies and (b) ensuring startups/SMEs won't be affected, since either could cause objections during the bill's passage. However, the regulatory burden of not retaliating against employees and educating them about their rights is extremely low, making whistleblower protection efficient regulation even with a lower harm threshold.
Recommendation 2: Focus on scale of potential harm rather than specific mechanisms, add a "catch-all" category for any risk meeting the damage threshold, and consider lowering harm thresholds from their current high levels.
3. Implementation Deficiencies
It's unclear how government offices will build the technical capacity to evaluate complex AI risk claims. The technical complexity of AI systems requires specialized expertise and resource allocation for effective implementation. The bill contains no provisions for building this capacity within oversight bodies and none for involving civil society in making these assessments.
A key question remains: do protections apply if a risk is later found to fall under the $1B threshold? What if there was only “reasonable cause to believe” that damage could have reached $500M? It might be very difficult for a researcher to assess this given the cutting-edge nature of the work; something that ends up being catastrophic may initially just "look odd." While we understand the balancing act of avoiding a flood of reports about small risks and protecting trade secrets, we are likely only talking about hundreds or low thousands of individuals who would even be in a relevant position to make such reports.
The bill should be explicitly clear that protections apply if reporting is not in bad faith. The "reasonable cause to believe" standard is a good first step, but the bill could be more explicit about who determines if this standard was met and that any good faith reporting is covered.
The bill has no mandatory reporting to regulators of identified risks, permitting companies to handle potentially significant risks entirely internally. This creates potential for covering up legitimate dangers and removes external accountability. There is still the benefit of having a paper trail of internal escalation in case a risk was flagged, ignored, and then materialized, but in the absence of wider regulation on catastrophic risks from AI, such a paper trail may not be very impactful.
The insufficient specification of "reasonable internal process" requirements allows companies to establish minimal or ineffective reporting systems, creating wide variation in protection quality across organizations. The lack of minimum standards for report handling and investigation, coupled with no provisions for third-party technical assessment of claims, relies on potentially biased internal company evaluations. To avoid overburdening companies with requirements, we do see value in using the ‘compute spend’ threshold here as a guiding metric for the extent of the minimum requirements.
The bill also lacks clarity on whether whistleblowers will remain anonymous (though this may be covered elsewhere in the Labor Code).
Recommendations 0 & 3: Equip the office handling reports with technical capacity, create provisions for third-party assessments, explicitly protect good faith reporting even if risks later fall below thresholds, and clarify standards for internal processes.
4. Missing Protections
The bill provides no protections for escalating to the public in cases of insufficient handling by the hotline, extremely high urgency based on good faith concerns, or worries that escalation to the hotline would lead to evidence destruction.
The bill specifies no explicit or sufficient penalties for retaliation, for suppressing employees from speaking up, or for failing to educate employees about their rights. This reduces the deterrent effect and creates uncertainty about consequences. The maximum penalty appears to be "reasonable attorney's fees" for retaliation, though the bill might implicitly invoke other Labor Code penalties (Section 1102.5); this is unclear. Penalties should be adequate to the damage that non-compliance could cause the public, which the bill itself quantifies at $1B.
The bill creates no advisory services for potential whistleblowers to evaluate claims by either government or civil society third-party organizations. This places the full burden of initial risk assessment on individual employees, creates uncertainty around protection eligibility, and discourages reporting in ambiguous cases.
There are no mechanisms for whistleblowers to safely seek external technical or legal help, isolating them during critical decision-making and preventing access to necessary expertise for complex technical matters.
Finally, the bill raises questions about open-source models: if an employee of Meta notices that someone has built an agent on their Llama foundation model to cause catastrophic harm by tweaking model weights, what are the consequences for Meta?
Recommendations 3 & 4: Establish protections for public escalation in critical cases, create explicit penalties proportionate to potential damage, require mandatory regulatory reporting, and establish advisory services and support mechanisms for whistleblowers.
Conclusion
Senator Wiener's bill represents an important step toward protecting whistleblowers who may identify catastrophic risks from advanced AI systems. While we support its passage even in its current form, addressing the limitations identified above would, we believe, dramatically strengthen its effectiveness in protecting both whistleblowers and the public from potential AI harms. We encourage the California legislature to consider these recommendations as the bill moves through the legislative process.
About OAISIS
OAISIS is a non-profit project aimed at supporting concerned insiders at the frontier of AI. We run Third Opinion.