Overview #
Regulatory classification is not a paperwork exercise. For face masks, it determines your preservative system, your claims language, your shelf-life testing protocol, and — in some markets — whether your product can legally be sold at all. We see brand partners come to us with finished briefs that are technically sound but categorically wrong for their target market. That costs time. Sometimes it costs the whole launch.
The EU, FDA, and NMPA treat face masks differently. Not slightly differently — structurally differently. A rinse-off clay mask, a leave-on hydrogel patch, and a bio-cellulose sheet mask can land in three separate regulatory buckets depending on how you write the claims and how long the product stays on skin. Getting this right at brief stage is the single most valuable thing we can do together before a single gram of formula is made.
This guide covers how we navigate classification across all three jurisdictions, what the instrumental and consumer evidence requirements look like in practice, and how to design a 12-week study that actually supports your claims.
How the Three Regulatory Frameworks Actually Classify Face Masks #
The starting point is always contact time and claims. Under EU Cosmetics Regulation 1223/2009, face masks are cosmetics by default — but the moment your claims cross into “treats,” “repairs damaged skin barrier,” or anything implying a physiological mechanism beyond surface-level action, you are in borderline territory. The EU does not have a separate “cosmeceutical” category. That word does not exist in European law. What exists is a hard line between cosmetic and medicinal product, and the SCCS has published opinions that make clear how they interpret efficacy language. We always recommend reviewing the SCCS Scientific Opinion framework before finalizing claims for EU-bound SKUs.
In the US, the FDA Cosmetics Guidelines place most face masks squarely in the cosmetic category — unless the product contains an active ingredient with a recognized OTC drug monograph function (think salicylic acid at acne-treatment concentrations). A 2% salicylic acid clay mask positioned as “acne treatment” is an OTC drug. The same formula positioned as “pore-cleansing” is a cosmetic. The claims drive the classification, not the formula. This is where we push back on brand briefs regularly.
China’s NMPA Cosmetic Regulation is the most structurally distinct of the three. Face masks fall under the Cosmetic Supervision and Administration Regulation (CSAR), which took effect in January 2021. Since the supporting registration and filing measures came into force in May 2021, all new cosmetic registrations and filings, masks included, go through the CSAR framework. Rinse-off masks are generally ordinary cosmetics. Leave-on masks with whitening, anti-hair-loss, or sunscreen claims are special-use cosmetics (CSAR itself calls them “special cosmetics”) requiring full registration, not just filing. That distinction adds 6–12 months to your China launch timeline and roughly ¥80,000–¥150,000 in registration costs. Most brands don’t factor that in at brief stage.
| Jurisdiction | Mask Category | Key Trigger for Elevated Classification | Typical Timeline |
|---|---|---|---|
| EU (1223/2009) | Cosmetic product | Medicinal claims, borderline actives above SCCS limits | 3–6 months CPNP notification |
| FDA (US) | Cosmetic or OTC Drug | OTC active ingredient + drug claim (e.g., acne treatment) | 30-day OTC registration or full NDA |
| NMPA (China) | Ordinary or Special-Use Cosmetic | Whitening, sunscreen, anti-hair-loss claims in leave-on format | 3–6 months (ordinary) / 12–18 months (special-use) |
The table above is a simplification. Real projects have edge cases. We had one brief last year — a leave-on bio-cellulose mask with tranexamic acid at 3% and a “brightening” claim. In the EU, fine. In China, that triggered special-use classification because the brightening mechanism was described in a way that implied melanin inhibition. We rewrote the claims. Same formula, different regulatory outcome.
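The first-pass logic described in this section can be sketched as a small rule set. This is a deliberately simplified illustration, not a legal determination: the class name, the function name, the keyword sets, and the category labels are all assumptions made for this example, and real claim review works on the full wording rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword sets, taken from the examples in the text above.
EU_BORDERLINE_TERMS = {"treats", "repairs damaged skin barrier"}
US_DRUG_CLAIMS = {"acne treatment"}
CN_SPECIAL_CLAIMS = {"whitening", "sunscreen", "anti-hair-loss"}


@dataclass
class MaskBrief:
    claims: set[str]
    leave_on: bool = False
    otc_monograph_active: bool = False  # e.g. salicylic acid at acne-treatment levels


def classify(brief: MaskBrief) -> dict[str, str]:
    """Return a first-pass category per jurisdiction, following the text above."""
    eu = "borderline: review" if brief.claims & EU_BORDERLINE_TERMS else "cosmetic"
    # In the US sketch, both the active and the drug claim must be present.
    is_drug = brief.otc_monograph_active and bool(brief.claims & US_DRUG_CLAIMS)
    us = "OTC drug" if is_drug else "cosmetic"
    # In the China sketch, special-use is triggered by claim plus leave-on format.
    is_special = brief.leave_on and bool(brief.claims & CN_SPECIAL_CLAIMS)
    cn = "special-use cosmetic" if is_special else "ordinary cosmetic"
    return {"EU": eu, "US": us, "CN": cn}


# Same formula, different claims: the US outcome flips.
print(classify(MaskBrief({"pore-cleansing"}, otc_monograph_active=True))["US"])
print(classify(MaskBrief({"acne treatment"}, otc_monograph_active=True))["US"])
```

A rule table like this is only useful as a brief-stage screen that flags which SKUs need a proper regulatory review.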
Instrumental Measurement: What the Data Actually Looks Like #
When brand partners ask us about efficacy substantiation, the first question we ask is: what claim are you trying to support, and in which market? Because the measurement method has to match the claim. You cannot use a corneometer reading to support a “visibly reduces pores” claim. That is not how the evidence chain works.
For hydration claims, the most common face mask claim globally, we use corneometry (Corneometer CM 825) and TEWL measurement (Tewameter TM 300) as the primary instrumental pair. In a typical single-application study we run internally, a well-formulated hyaluronic acid sheet mask at 2% HA (mixed molecular weights, 50 kDa and 1,500 kDa blend) shows a corneometer delta of +28 to +35 AU immediately post-removal, dropping to +12 to +18 AU at 4 hours. That decay curve matters. A lot of brands want to claim “24-hour hydration” based on the immediate reading. We don’t support that claim unless the 24-hour data actually holds.
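The rule in the paragraph above, that a duration claim needs data at that time point, can be written as a small check. This is a sketch under stated assumptions: the function name and the `min_delta_au` cutoff are hypothetical, and the sample readings are simply the midpoints of the ranges quoted above, not new data.

```python
def supports_duration_claim(deltas_au: dict[float, float],
                            claim_hours: float,
                            min_delta_au: float = 0.0) -> bool:
    """True only if a corneometer delta was measured at the claimed time
    point and it exceeds the cutoff. Missing time points never pass."""
    delta = deltas_au.get(claim_hours)
    return delta is not None and delta > min_delta_au


# Hours after removal -> mean corneometer delta (AU); midpoints of the quoted ranges.
readings = {0.0: 31.5, 4.0: 15.0}

print(supports_duration_claim(readings, 4.0))   # True
print(supports_duration_claim(readings, 24.0))  # False: no 24-hour measurement
```

In a real study the fixed cutoff would be replaced by a statistical comparison against baseline, but the principle is the same: no measurement at the claimed time point, no claim.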
For skin texture and pore appearance, we use optical profilometry (PRIMOS or Visioscan VC 98). These give you Ra and Rz values — surface roughness parameters — that can be statistically compared pre- and post-treatment. Honestly, the consumer perception of “smoother skin” correlates reasonably well with Ra reduction, but the absolute numbers are small. A 15–20% Ra reduction is clinically meaningful in our experience; below that, it’s hard to build a credible claim.
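The percentage figure above is a relative change in Ra against baseline. A minimal sketch of that arithmetic follows; the function names and the example Ra values are hypothetical, and the 15% cutoff is the lower bound of the range quoted above.

```python
def ra_reduction_pct(ra_before: float, ra_after: float) -> float:
    """Relative reduction in mean roughness (Ra), as a percentage of baseline."""
    if ra_before <= 0:
        raise ValueError("baseline Ra must be positive")
    return (ra_before - ra_after) / ra_before * 100.0


def meets_texture_threshold(ra_before: float, ra_after: float,
                            threshold_pct: float = 15.0) -> bool:
    """True if the Ra reduction reaches the cutoff used in the text."""
    return ra_reduction_pct(ra_before, ra_after) >= threshold_pct


# Hypothetical pre/post values in the same unit (for example micrometres).
print(round(ra_reduction_pct(20.0, 16.5), 1))
print(meets_texture_threshold(20.0, 18.0))
```

The same calculation applies to Rz; what matters is that pre- and post-treatment values come from the same instrument and measurement site.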
Skin tone evenness for brightening masks uses chromametry (Minolta CR-400) measuring L* values. An L* increase of 1.5–2.0 units is generally considered perceptible to the human eye under controlled conditions. We’re still not fully convinced that single-application brightening claims are supportable for most formulas; the L* shifts we see at single application are usually below 1.0. Repeated use over 4 weeks is where the data gets interesting.
For anti-aging or firming claims, cutometry (Cutometer MPA 580) measuring R2 (gross elasticity) and R5 (net elasticity) is the standard. These studies need to run at minimum 4 weeks, ideally 8–12 weeks, to show meaningful change. Single-application firming claims are almost always marketing language, not instrumental data.
Consumer Panel Design: Where Most Brands Get This Wrong #
Instrumental data alone does not sell products. Consumer perception data does. But designing a consumer panel study that is both scientifically credible and commercially useful is harder than it looks. This is usually where projects go sideways.
The basic structure we recommend for a face mask consumer study: minimum 30 subjects (we prefer 40–50 for statistical power), Fitzpatrick skin types II–V for global relevance, age range matched to target demographic, 4-week minimum use period with standardized application protocol (frequency, duration, removal method). Subjects complete validated self-assessment questionnaires at baseline, week 2, and week 4. The questionnaire uses a 5-point Likert scale for each claim attribute — hydration, radiance, texture, firmness — and results are reported as percentage of subjects agreeing or strongly agreeing.
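The reporting step described above, the percentage of subjects agreeing or strongly agreeing, is often called a top-two-box score. The sketch below assumes responses are coded 1 to 5 with 4 meaning "agree" and 5 meaning "strongly agree"; that coding, the function name, and the sample panel are assumptions for illustration.

```python
def top_two_box_pct(responses: list[int]) -> float:
    """Percentage of subjects scoring 4 or 5 on a 5-point Likert item."""
    if not responses:
        raise ValueError("no responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    agreeing = sum(1 for r in responses if r >= 4)
    return 100.0 * agreeing / len(responses)


# Hypothetical week-4 "hydration" item for a 40-subject panel.
week4_hydration = [5] * 12 + [4] * 16 + [3] * 8 + [2] * 3 + [1] * 1
print(top_two_box_pct(week4_hydration))  # 70.0
```

Computing the score separately for each attribute and each time point (baseline, week 2, week 4) keeps the claim wording tied to a specific questionnaire item.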
One clinical study we reference for sheet mask hydration substantiation: a double-blind, randomized, vehicle-controlled trial (n=42, 8 weeks, twice-weekly application) showed 67% of subjects reported “noticeably more hydrated skin” versus 31% in the vehicle control group at week 8. Corneometer readings at week 8 showed a mean delta of +19 AU versus +6 AU in control. That is the kind of data package that supports a credible hydration claim in both EU and US markets.
The failure mode we see most often: brands run a 2-week open-label study with 20 subjects and try to use that data for EU claims substantiation. The EU expects robust, well-controlled data. Twenty subjects over two weeks is not robust. The common criteria for cosmetic claims in Commission Regulation (EU) No 655/2013 require claims to be supported by adequate and verifiable evidence, and underpowered studies get challenged.
Before/after photography protocol is also frequently underspecified. We require: standardized lighting (VISIA or equivalent cross-polarized system), fixed camera position with chin rest, consistent time of day (morning, pre-product application), no makeup, 30-minute acclimatization period in controlled environment (21°C, 50% RH). Deviating from any of these introduces variability that makes the images unusable for claims support. We rejected the first photography vendor on one project because they couldn’t guarantee consistent cross-polarized lighting across all time points. That sounds pedantic. It isn’t.
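One way to keep a vendor honest is to log each session against the protocol and flag deviations automatically. The sketch below is illustrative only: the class, the field names, and the tolerances are assumptions, since the text gives only the targets (21°C, 50% RH, 30 minutes) and not acceptable ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotoSession:
    cross_polarized: bool
    fixed_camera_with_chin_rest: bool
    morning_pre_application: bool
    makeup_free: bool
    acclimatization_min: float
    room_temp_c: float
    room_rh_pct: float


def protocol_deviations(s: PhotoSession,
                        temp_tol_c: float = 1.0,
                        rh_tol_pct: float = 5.0) -> list[str]:
    """List every deviation from the photography protocol described above.
    Tolerances are hypothetical defaults, not values from the protocol."""
    issues = []
    if not s.cross_polarized:
        issues.append("lighting not cross-polarized")
    if not s.fixed_camera_with_chin_rest:
        issues.append("camera position not fixed with chin rest")
    if not s.morning_pre_application:
        issues.append("not a morning, pre-application session")
    if not s.makeup_free:
        issues.append("subject wearing makeup")
    if s.acclimatization_min < 30:
        issues.append("acclimatization under 30 minutes")
    if abs(s.room_temp_c - 21.0) > temp_tol_c:
        issues.append("room temperature outside tolerance")
    if abs(s.room_rh_pct - 50.0) > rh_tol_pct:
        issues.append("relative humidity outside tolerance")
    return issues


session = PhotoSession(True, True, True, True, 20, 24.0, 50.0)
print(protocol_deviations(session))
```

An empty list means the images from that session are candidates for claims support; anything else is a reason to reshoot or exclude the time point.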
The Hard Truth About Leave-On vs. Rinse-Off Classification #
Contact time is the variable that changes everything. A rinse-off clay mask sitting on skin for 10–15 minutes has a fundamentally different safety and regulatory profile than a leave-on overnight sleeping mask. This seems obvious. In practice, brands blur this line constantly.
Under EU regulation, leave-on products face stricter concentration limits for many actives. Salicylic acid, for example, is permitted at 2% in rinse-off face products but only 0.5% in leave-on products per Annex III of 1223/2009. If you formulate a “sleeping mask” with 1% salicylic acid and position it as leave-on, you have a compliance problem in the EU. We catch this regularly. The brand brief says “sleeping mask,” the formula has 1.2% salicylic acid, and nobody flagged the contact time issue until we asked.
For our acid exfoliation technology formulations, this is a constant conversation. AHA concentrations that are perfectly acceptable in a 10-minute rinse-off mask become problematic in a leave-on format — both from a regulatory standpoint and from a consumer safety standpoint. At pH 3.2 with 8% glycolic acid, a leave-on product will cause sensitization in a meaningful percentage of users. That is not a hypothetical. We’ve seen the consumer complaint data from brands who launched without proper contact-time-adjusted safety assessment.
The NMPA adds another layer: in China, sheet masks are classified as rinse-off products even if the consumer leaves them on for 20–30 minutes, because the product is physically removed. Overnight sleeping masks are leave-on. That distinction affects which preservative systems are acceptable and what concentration limits apply. It’s not a perfect system. But it’s the system.
Designing a 12-Week Efficacy Study for Face Masks #
This is the section most brand partners actually need. A 12-week study is the gold standard for anti-aging, brightening, and barrier repair claims. Here is how we structure it.
Study design framework:
Week 0 (Baseline): Full instrumental panel — corneometry, TEWL, cutometry, chromametry, profilometry. Standardized photography. Baseline questionnaire. Dermatologist skin assessment. Blood draw if testing for systemic absorption (required for some NMPA registrations).
Weeks 1–12 (Treatment): Standardized application protocol — 2–3 times per week for sheet masks, daily for sleeping masks. Subjects maintain consistent skincare routine (provided by study sponsor, no actives outside study product). No sun exposure protocol — SPF 30 minimum daily, no tanning.
Week 4 and Week 8 (Interim): Abbreviated instrumental panel (corneometry + TEWL minimum), interim questionnaire. Photography. This is where you catch early responders and flag any adverse events.
Week 12 (Endpoint): Full instrumental panel repeat. Final questionnaire. Dermatologist assessment. Photography. Subject satisfaction survey.
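The visit framework above can be captured as a simple schedule structure, which makes it easy to check which measurements are available at which time point. The dictionary layout and helper name are assumptions for this sketch, and the optional blood draw is omitted for brevity.

```python
FULL_PANEL = ("corneometry", "TEWL", "cutometry", "chromametry", "profilometry")
INTERIM_PANEL = ("corneometry", "TEWL")  # stated minimum for interim visits

# Study week -> what happens at that visit, following the framework above.
SCHEDULE = {
    0: {"instruments": FULL_PANEL, "photography": True,
        "questionnaire": "baseline", "dermatologist": True},
    4: {"instruments": INTERIM_PANEL, "photography": True,
        "questionnaire": "interim", "dermatologist": False},
    8: {"instruments": INTERIM_PANEL, "photography": True,
        "questionnaire": "interim", "dermatologist": False},
    12: {"instruments": FULL_PANEL, "photography": True,
         "questionnaire": "final", "dermatologist": True},
}


def weeks_measuring(instrument: str) -> list[int]:
    """Return the study weeks at which a given instrument is scheduled."""
    return sorted(week for week, visit in SCHEDULE.items()
                  if instrument in visit["instruments"])


print(weeks_measuring("TEWL"))       # [0, 4, 8, 12]
print(weeks_measuring("cutometry"))  # [0, 12]
```

With the minimum interim panel, cutometry and chromametry have no week 4 or week 8 data, so any claim about when firming or brightening effects appear would need those instruments added to the interim visits.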
Statistical requirements: Aim for 80% power at α = 0.05 to detect a 15% change in the primary endpoint, and run a power calculation before finalizing n to confirm that your planned panel (40 subjects, say) actually delivers it. We’ve seen brands try to run 25-subject studies and then wonder why the data doesn’t reach significance. The math doesn’t lie.
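As a rough illustration of the power calculation mentioned above, the sketch below uses the normal approximation for a paired (baseline versus endpoint) design. It rests on assumptions that are not in the text: the standard deviation of the within-subject change has to come from pilot data, the example values (15% mean change, 30% SD of change, 15% dropout) are hypothetical, and the normal approximation slightly underestimates the n that an exact t-based calculation would give.

```python
import math
from statistics import NormalDist


def paired_sample_size(mean_change: float, sd_change: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Subjects needed to detect a mean within-subject change
    (two-sided test, normal approximation)."""
    if sd_change <= 0 or mean_change == 0:
        raise ValueError("need a non-zero change and a positive SD")
    effect = abs(mean_change) / sd_change           # standardized effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / effect) ** 2)


def inflate_for_dropout(n: int, dropout_rate: float) -> int:
    """Recruit enough subjects that n remain after the expected dropout."""
    return math.ceil(n / (1 - dropout_rate))


n_completers = paired_sample_size(mean_change=15.0, sd_change=30.0)
print(n_completers)                             # 32
print(inflate_for_dropout(n_completers, 0.15))  # 38
```

Under these illustrative inputs a 40-subject panel is adequate, but a noisier endpoint changes the answer quickly, which is why the calculation belongs before recruitment rather than after.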
What to measure and why: For a hydration-focused mask, primary endpoint is corneometry delta at week 12 versus baseline. Secondary endpoints: TEWL, consumer perception of hydration, photography assessment. For a brightening mask, primary endpoint is L* change at week 12. For anti-aging, primary endpoint is cutometry R2 change. Don’t try to make everything a primary endpoint — that dilutes statistical power and makes the study harder to interpret.
One thing we’ve learned from running these studies: the week 8 data is often more interesting than week 12. Some actives plateau. Some show continued improvement. Knowing where your formula sits on that curve helps you write better claims and better marketing copy. It also helps us optimize the formula for the next iteration.
For brands targeting both EU and US markets, we recommend aligning the study design with ICH principles for documentation and data integrity, specifically the Good Clinical Practice guideline (ICH E6) rather than the stability guidelines. ICH is primarily a pharmaceutical framework, but the documentation standards translate well and make your dossier more credible with EU competent authorities.
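ISO standards provide additional methodological grounding that EU-focused brands should reference when designing consumer panels: the ISO 10993 series for biological safety evaluation (written for medical devices, but a useful methodological reference) and the ISO sensory analysis standards, such as ISO 8586 for selecting and training assessors and ISO 11136 for consumer hedonic tests.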
For brands also developing barrier repair and sensitive skin formulations within the mask category, the 12-week study design above can be adapted with TEWL as the primary endpoint and a sensitized skin subgroup analysis.
Formulation Notes for Brand Partners #
What market? What are you expecting on-pack? Those are the first two questions we ask every brand partner who comes to us with a face mask brief. Because the answers determine everything downstream — formula architecture, preservative system, claims language, and study design.
If you’re targeting EU and US simultaneously, we build the formula to EU limits first. EU is almost always the more restrictive jurisdiction for actives concentration. Then we check FDA OTC status for any actives. Then we assess NMPA classification if China is in scope.
For rinse-off clay masks, our standard preservative system is phenoxyethanol at 0.8% with ethylhexylglycerin at 0.3%. This passes challenge testing (ISO 11930) reliably and is accepted across EU, US, and China. For leave-on formats, we sometimes need to adjust — particularly for low-pH formulas where the preservative efficacy profile shifts.
Sheet mask essence formulation is where we see the most brief inflation. Brands want 10 actives in the essence. In practice, a well-designed essence with 3–4 actives at clinically relevant concentrations outperforms a 10-active formula where everything is at trace levels. We push back on this. Not always successfully, but we push back.
Packaging compatibility testing for face masks is non-negotiable. The pouch material, the essence volume, the folding pattern of the sheet: all of these interact with the formula. We require 3-month compatibility testing at 40°C before any production run. On one project, everything worked fine at 500 g lab scale, but at 200 kg production scale the pouch laminate was releasing trace plasticizers into the essence at elevated temperature. We caught it at stability. It would have been a recall if we hadn’t.
Frequently Asked Questions #
Q: We want to launch a sleeping mask in both the EU and China — do we need two separate formulas?
Not necessarily, but you might need two separate claims packages. The formula can often be identical, but China’s NMPA may classify it as special-use if you make whitening or other special-category claims (anti-aging on its own is not one of the special categories under CSAR), which triggers a separate registration pathway. Budget 12–18 months and ¥100,000–¥150,000 for that route if it applies.
Q: Our brief calls for 2% salicylic acid in a leave-on overnight mask — is that a problem?
Yes, in the EU it is. EU Annex III limits salicylic acid to 0.5% in leave-on products. You’d need to either reduce the concentration or reposition the product as rinse-off with a defined contact time. We’d also want to discuss whether the product is crossing into OTC drug territory in the US at that concentration with acne claims.
Q: How many subjects do we actually need for a consumer perception study to support EU claims?
Minimum 30, but we recommend 40–50 for adequate statistical power. The study needs to run at least 4 weeks for most claims, 8–12 weeks for anti-aging or brightening. Single-application studies with 20 subjects will not hold up to scrutiny from EU competent authorities.
Q: Can we use the same efficacy study data for both EU and NMPA registration?
Sometimes. NMPA requires studies conducted in Chinese subjects for special-use cosmetic registration — a study run entirely on European subjects may not be accepted. For ordinary cosmetic filing in China, the requirements are less strict. We recommend designing the study with a mixed-ethnicity panel (minimum 30% Asian subjects) if you’re planning dual-market use of the data.
Q: What’s the realistic timeline from brief to finished product with a full 12-week study?
Formula development: 8–12 weeks. Stability and safety testing: 12–16 weeks (running in parallel where possible). Consumer study: 12 weeks plus 4 weeks for data analysis and reporting. Regulatory filing: 4–8 weeks for EU CPNP, 3–6 months for NMPA ordinary cosmetic. Total realistic timeline: 12–18 months for a properly substantiated launch. Brands who plan for 6 months are almost always wrong.
© 2026 Mastracare.com. All rights reserved.