Overview
Lip care sits at the intersection of cosmetic and quasi-drug regulation in ways that catch brand owners off guard. A tinted lip balm in the EU is a cosmetic. The same product with SPF in the US becomes an OTC drug. Add a “plumping” claim in China and you may be looking at a special cosmetic filing. We see this confusion on almost every lip care brief that comes through our lab — and the colorant approval piece alone has derailed more launch timelines than any other single issue in this category.
How Regulatory Category Determines Everything Downstream
The first question we ask when a brand briefs us on a lip product is not “what actives do you want?” It’s “what claims are you making, and in which markets?” That answer determines the regulatory category, which determines the permitted colorant list, which determines your formula architecture. Getting this wrong at brief stage means reformulating at stability stage. We’ve seen it happen.
In the EU, lip products are governed by the EU Cosmetics Regulation (EC) No 1223/2009, specifically Annex IV for colorants. The critical distinction: only colorants listed in Annex IV are permitted, and many entries carry a restriction on where the colorant may be applied. Any colorant restricted from use on mucous membranes is immediately excluded from lip products. Brands often come to us with a Pantone reference and a mood board. We have to tell them that three of their five shades aren’t buildable with Annex IV-approved pigments at the saturation they want.
In the US, the FDA regulates lip colorants under 21 CFR Parts 73 and 74. The list is shorter than most brands expect. D&C Red No. 6 and No. 7 are workhorses of the lip category for good reason: they’re approved for lips and mucous membranes. But lakes and dyes behave completely differently in anhydrous lip bases, and we’ve had batches where a client-specified colorant bled at 40°C within six weeks because the dye form was used in a wax matrix that couldn’t hold it. FDA Cosmetics Guidelines cover the approved color additive lists, but the practical formulation behavior is something you only learn from running the batches.
China’s NMPA Cosmetic Regulation operates on a positive list system for colorants as well, but the overlap with EU Annex IV is imperfect. We maintain a three-market colorant cross-reference internally — it’s one of the first tools we pull up on any lip brief. For brands targeting all three markets simultaneously, the intersection of approved colorants is narrower than most expect, and achieving certain warm-red or deep-berry shades within that intersection requires pigment blending strategies that add complexity to the dispersion process.
| Colorant | EU Annex IV Status | FDA 21 CFR Status | NMPA Permitted |
|---|---|---|---|
| D&C Red No. 7 (CI 15850:1) | Permitted, lips only | Permitted, lips & mucous membrane | Permitted |
| D&C Red No. 6 (CI 15850) | Permitted, lips only | Permitted, lips & mucous membrane | Permitted |
| Carmine (CI 75470) | Permitted | Permitted | Permitted (declaration required) |
| Mica (CI 77019) | Permitted | Permitted | Permitted |
| Ultramarines (CI 77007) | Permitted, not around eyes | Not permitted for lips | Not permitted |
| D&C Orange No. 5 (CI 45370) | Not listed | Permitted, lips only | Not permitted |
| Chromium Oxide Greens (CI 77288) | Permitted | Permitted | Permitted |
Ultramarines are the one that trips people up most often. They’re fine in eyeshadow, fine in body products — but not for lips in the US, and not in China either. We had a client brief a “dusty mauve” collection that relied heavily on ultramarines for the cool-grey undertone. The reformulation to achieve the same visual effect using permitted alternatives added three weeks to the development timeline and increased raw material cost by roughly 12% on those SKUs.
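The cross-reference logic can be sketched as a simple filter over per-market flags. The snippet below is a minimal illustration that encodes only the seven rows of the table above; the dictionary layout and the function name `lip_safe_in` are assumptions made for this example, not our internal tool, and the flags should be re-checked against the current regulations before anyone relies on them.

```python
# Lip-use status per market, transcribed from the table above.
# True = usable in a lip product; False = not listed or not permitted for lips.
# Market codes (EU, US, CN) are illustrative labels.
LIP_STATUS = {
    "D&C Red No. 7":         {"EU": True,  "US": True,  "CN": True},
    "D&C Red No. 6":         {"EU": True,  "US": True,  "CN": True},
    "Carmine":               {"EU": True,  "US": True,  "CN": True},
    "Mica":                  {"EU": True,  "US": True,  "CN": True},
    "Ultramarines":          {"EU": True,  "US": False, "CN": False},
    "D&C Orange No. 5":      {"EU": False, "US": True,  "CN": False},
    "Chromium Oxide Greens": {"EU": True,  "US": True,  "CN": True},
}


def lip_safe_in(markets):
    """Return the colorants usable on lips in every one of the given markets."""
    return sorted(
        name
        for name, status in LIP_STATUS.items()
        if all(status[market] for market in markets)
    )


if __name__ == "__main__":
    # Three-market intersection: ultramarines and D&C Orange No. 5 drop out.
    print(lip_safe_in(["EU", "US", "CN"]))
    # US-only launch: D&C Orange No. 5 comes back into play.
    print(lip_safe_in(["US"]))
```

Running the filter before shade development makes it obvious which colorants the shade range can actually draw on.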
SPF Lip Products: Where Cosmetic Becomes Drug
This is usually where projects go sideways. A brand wants a tinted lip balm with SPF 15. Simple enough brief. But in the US, that product is now an OTC drug under FDA monograph, which means the sunscreen actives must come from the approved monograph list, labeling must follow OTC drug labeling requirements, and the colorants must comply with both cosmetic color additive regulations and drug product requirements. Our sun protection formulation documentation covers the UV filter side in detail, but the lip-specific colorant interaction is worth addressing here.
Titanium dioxide in lip products is a good example of the complexity. As a colorant (CI 77891), it’s on the approved lists in all three markets. As a UV filter in the US OTC monograph, it’s permitted up to 25%. But the particle size and surface treatment that optimizes UV performance is different from what gives you the cleanest white base for color mixing. We run separate titanium dioxide grades for these two functions in our lab. Using the UV-grade TiO2 as your primary white pigment in a tinted formula often gives you a chalky, streaky finish that no amount of wax adjustment fixes.
In the EU, SPF lip products remain cosmetics but must comply with the SCCS Scientific Opinion recommendations on UV filter safety and the Cosmetics Regulation Annex VI for permitted UV filters. Avobenzone (butyl methoxydibenzoylmethane) is permitted in the EU at up to 5%, compared with 3% under the US monograph. In a lip product, however, ingestion exposure becomes a safety consideration that the SCCS has flagged in several opinions. We don’t think the industry has converged on a clear answer for how to handle this in safety assessments, and we tell brand partners that honestly.
Consumer Perception Studies and Instrumental Measurement
Honestly, most brands underestimate how much the claims substantiation piece matters for lip care specifically. “Moisturizing,” “plumping,” “long-lasting color” — these all require different measurement approaches, and the study design choices you make at brief stage determine what you can actually say on pack.
For moisturization claims, we use corneometry on the lip vermilion border. The Courage + Khazaka Corneometer CM 825 is our standard instrument. Baseline hydration values on lips are lower and more variable than on facial skin: typically 30–55 AU in our panel population versus 45–70 AU on the cheek. A well-formulated occlusive lip balm with 5% shea butter and 3% hydrogenated castor oil can push that to 65–80 AU at the 4-hour mark. That’s a real, measurable effect. Whether it translates to a “24-hour moisture” claim depends on the 24-hour timepoint data, which we always run because regulators in the EU and increasingly in China want to see the full time-course, not just the peak.
For plumping claims, the measurement story is more complicated. Lip volume measurement using 3D profilometry (we use the Canfield VISIA-CR system with a lip-specific positioning protocol) can reliably detect volume changes of 3–5%. In our experience, hyaluronic acid-based plumping serums produce a 4–8% volume increase at 30 minutes post-application in responsive subjects. The problem is inter-subject variability: some subjects show a 15% increase, others show nothing. A study with n=30 will often have a statistically significant mean effect that masks the fact that 40% of subjects are non-responders. We flag this to brand partners because “clinically proven plumping” based on a mean effect that a large share of your consumers won’t experience is a claims risk.
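To make the non-responder problem concrete, here is a minimal sketch that reports a responder rate alongside the mean. The panel values are invented for illustration, and the 3% responder threshold is an assumption borrowed from the lower end of the detection range quoted above, not a regulatory definition.

```python
from statistics import mean


def responder_summary(pct_changes, threshold=3.0):
    """Summarize per-subject volume changes (in %) as mean plus responder rate.

    `threshold` is the minimum % change counted as a response (assumed value).
    """
    responders = [x for x in pct_changes if x >= threshold]
    return {
        "n": len(pct_changes),
        "mean_change": mean(pct_changes),
        "responder_rate": len(responders) / len(pct_changes),
    }


if __name__ == "__main__":
    # Hypothetical 10-subject panel: six clear responders, four flat.
    panel = [15.0, 12.0, 9.0, 8.0, 7.0, 6.0, 0.5, 0.0, 0.0, -0.5]
    print(responder_summary(panel))
```

In this made-up panel the mean change looks healthy while four of ten subjects show essentially nothing, which is the pattern worth reporting next to any mean-based claim.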
One clinical study we ran for a peptide-enriched lip treatment (acetyl hexapeptide-3 at 4%, hyaluronic acid at 1.5%) used a split-lip design with n=42 subjects, 8-week duration, twice-daily application. Corneometry showed 23% improvement in lip hydration versus baseline at week 4, sustained at 19% at week 8. Lip volume by 3D profilometry showed 6.2% increase versus baseline at week 8. The consumer self-assessment panel (same n=42) showed 78% agreement with “my lips feel more hydrated” and 61% agreement with “my lips look fuller.” The gap between the instrumental data and the consumer perception data on the plumping endpoint is worth noting — and it’s typical. Consumers are harder to convince on volume than instruments are.
For color longevity claims, we use a Konica Minolta spectrophotometer to measure ΔE (color difference) at baseline and after standardized eating/drinking challenges. A ΔE below about 2.0 is generally treated as hard to see under normal viewing. Most conventional lip colors show ΔE of 4–8 after a standardized meal challenge. Transfer-resistant formulas using film-forming polymers can hold ΔE below 2.5 through the same challenge. That’s the number that supports a “transfer-resistant” claim. Our lip care formulation resources go into the film-former selection in more detail.
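For readers who want to see the arithmetic, the sketch below computes ΔE from two CIELAB readings. It assumes the CIE76 formula (plain Euclidean distance in L\*a\*b\* space); the text does not say which ΔE formula is used, and CIE94 or CIEDE2000 would give different numbers. The Lab readings and the function names are hypothetical.

```python
import math


def delta_e_cie76(lab_before, lab_after):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab_before, lab_after)


def within_transfer_resistant_limit(delta_e, limit=2.5):
    """Check a post-challenge ΔE against the 2.5 working limit quoted in the text."""
    return delta_e < limit


if __name__ == "__main__":
    baseline = (45.0, 52.0, 20.0)          # hypothetical reading before the meal challenge
    film_former = (46.0, 50.0, 19.0)       # hypothetical reading, transfer-resistant formula
    conventional = (48.0, 47.0, 18.0)      # hypothetical reading, conventional formula

    for label, reading in [("film-former", film_former), ("conventional", conventional)]:
        de = delta_e_cie76(baseline, reading)
        print(label, round(de, 2), within_transfer_resistant_limit(de))
```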
Before/After Photography Protocol
Photography is where a lot of brands cut corners and then can’t use the images for regulatory-compliant marketing. The protocol matters as much as the formula.
We use a standardized setup: Canfield VISIA-CR imaging system, fixed focal length, standardized lighting (5500K, CRI >95), subject positioning with chin rest and forehead rest for reproducibility. Subjects remove all lip products 24 hours before baseline photography. No lip liner, no primer. Lips are gently exfoliated with a standardized protocol (wet cotton pad, 10 circular strokes) 30 minutes before baseline imaging to normalize surface texture. This sounds excessive until you’ve seen how much baseline variation in surface texture affects the apparent before/after difference.
Timepoints we recommend: baseline (T0), 2 weeks, 4 weeks, 8 weeks, 12 weeks. For immediate-effect claims (plumping, gloss), we add T+30min and T+2hr on day 1. All photography sessions are conducted at the same time of day, ±1 hour, to control for diurnal variation in lip hydration.
One pilot batch failed our photography validation because we hadn’t controlled for subject hydration status. Morning sessions after overnight fasting showed systematically lower baseline lip hydration than afternoon sessions, which inflated the apparent treatment effect in the morning cohort. We now require all subjects to consume 500mL of water in the 2 hours before each session and document it.
Designing a 12-Week Lip Care Efficacy Study
When a brand partner asks us to design a 12-week study, the first thing we establish is the primary endpoint. Everything else — sample size, measurement frequency, statistical plan — flows from that. A study trying to prove everything proves nothing, and regulators know it.
For a moisturizing lip balm, the primary endpoint is corneometry at week 4 or week 8 (not week 12 — most hydration effects plateau before that). Secondary endpoints: TEWL at the lip border, consumer self-assessment, and photography. For a plumping treatment, primary endpoint is 3D profilometry volume at week 8. For a color product with longevity claims, the primary endpoint is ΔE after standardized challenge at week 4.
Sample size: for a corneometry primary endpoint with expected effect size of 15% improvement and standard deviation of 12% (based on our historical data), you need n=38 to achieve 80% power at α=0.05. We typically recruit n=50 to account for 20–25% dropout. For a plumping endpoint with higher variability, n=50 minimum, recruit n=65.
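The dropout adjustment is simple arithmetic, sketched below. The helper name `recruits_needed` is an assumption for this example; it takes the required number of completers as given and does not attempt to reproduce the power calculation itself.

```python
def recruits_needed(completers, dropout_pct):
    """Smallest recruitment target that still leaves `completers` after dropout.

    `dropout_pct` is a whole-number percentage (e.g. 20 for 20%).
    Integer arithmetic avoids floating-point rounding at exact boundaries.
    """
    if not 0 <= dropout_pct < 100:
        raise ValueError("dropout_pct must be in [0, 100)")
    retained_pct = 100 - dropout_pct
    # Ceiling division: ceil(completers * 100 / retained_pct)
    return -(-completers * 100 // retained_pct)


if __name__ == "__main__":
    for completers in (38, 50):
        low = recruits_needed(completers, 20)
        high = recruits_needed(completers, 25)
        print(f"need {completers} completers -> recruit {low} to {high}")
```

Under these assumptions the recruitment targets quoted above (50 and 65) fall inside the ranges the helper produces for 20–25% dropout.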
Study design for a 12-week lip treatment study:
Weeks 1–2: run-in period, subjects use only the provided unfragranced lip balm base (no actives), twice daily. This normalizes baseline lip condition and washes out prior product effects. Baseline measurements at end of week 2.
Weeks 3–8: active treatment phase, twice daily application of test product. Measurements at week 4 and week 8. Consumer self-assessment questionnaire at week 4 and week 8.
Weeks 9–12: extended follow-up, continue twice daily application. Final measurements at week 12. Optional: washout photography at week 14 to assess durability of effect post-discontinuation.
Statistical analysis: paired t-test for within-subject comparisons versus baseline. If split-lip design, use paired analysis on the treated versus untreated side. Adjust for multiple comparisons if more than one endpoint is tested as primary. We borrow the documentation and data-handling discipline of the ICH Stability Guidelines even for efficacy studies, because that discipline transfers.
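As a rough illustration of the paired analysis, the sketch below computes the paired t statistic from per-subject differences and a Bonferroni-adjusted significance level. Bonferroni is one common correction and is an assumption here, since the text does not name a method. The corneometry readings are invented, and converting t to a p-value is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(baseline, followup):
    """Return (t statistic, degrees of freedom) for paired measurements.

    Assumes the per-subject differences are not all identical (non-zero SD).
    """
    if len(baseline) != len(followup) or len(baseline) < 2:
        raise ValueError("need two equal-length samples with at least 2 subjects")
    diffs = [after - before for before, after in zip(baseline, followup)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons


if __name__ == "__main__":
    # Hypothetical corneometry readings (AU) for five subjects.
    week_0 = [40, 45, 50, 38, 42]
    week_4 = [52, 55, 58, 47, 55]
    t_stat, df = paired_t(week_0, week_4)
    print(f"t = {t_stat:.2f}, df = {df}")
    print("per-test alpha for 2 endpoints:", bonferroni_alpha(0.05, 2))
```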
The photography protocol runs in parallel with all instrumental measurements. Every measurement timepoint gets a photography session. Images are evaluated by a blinded grader using a standardized lip condition scale (0–4 for dryness, flakiness, lip line definition, and color evenness). Inter-rater reliability should be established before the study starts — we require κ >0.7 between graders before we accept grading data.
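The agreement check can be sketched with Cohen's kappa. This is a minimal unweighted version; for an ordinal 0–4 scale a weighted kappa may be more appropriate, and the text does not specify which variant is used. The grader scores below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length score lists")
    n = len(rater_a)
    # Observed agreement: share of items where both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal score frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical dryness scores (0-4) from two blinded graders on ten images.
    grader_1 = [0, 1, 2, 3, 4, 2, 1, 0, 3, 4]
    grader_2 = [0, 1, 2, 3, 4, 2, 1, 1, 3, 4]
    kappa = cohens_kappa(grader_1, grader_2)
    print(f"kappa = {kappa:.3f}, accept grading data: {kappa > 0.7}")
```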
One thing we haven’t fully solved: standardizing the eating/drinking challenge for color longevity studies across different cultural contexts. A standardized Western meal challenge (coffee, sandwich, apple) doesn’t reflect eating patterns in Asian markets where oilier foods are more typical. Our current approach uses two separate challenges, but it adds cost and complexity. It’s not a perfect solution.
Formulation Notes for Brand Partners
Which markets? What claims do you want on-pack? Those are the first two questions. A “moisturizing tinted lip balm” brief means something completely different depending on whether you’re launching in Germany, the US, or China, and the colorant list, claims language, and stability requirements diverge from day one.
For EU and US dual-market launches, we start with the Annex IV / 21 CFR intersection colorant set and build the shade range from there. Don’t start with the shades and work backward to compliance — we’ve seen that approach add 6–8 weeks to development timelines. For China inclusion, add the NMPA positive list filter before any shade development begins.
On actives: if you want moisturization claims, hyaluronic acid at 1–2% combined with an occlusive base (hydrogenated castor oil, beeswax, or carnauba wax at 15–25% total wax phase) gives you reliable corneometry data. If you want plumping claims, peptide actives need to be at meaningful concentrations — acetyl hexapeptide-3 below 2% rarely moves the profilometry needle in our experience. Encapsulation for lip actives adds roughly 2.5× the raw material cost and is usually unnecessary for the delivery environment — lips are a relatively permeable barrier. Budget that cost carefully.
Packaging: airless pump for lip serums adds $0.50–$0.90 per unit at MOQ 3,000. Most indie brands absorb that. Standard doe-foot applicator tubes are fine for most lip treatment formats and keep COGS manageable.
Frequently Asked Questions
Q: We want to launch the same tinted lip balm in the EU, US, and China — can we use one formula?
One formula is possible but the colorant set is the binding constraint. The three-market approved colorant intersection is workable for neutral and warm-red shades, but cool-toned purples and blues are very difficult. Expect to run at least two colorant variants if your shade range is broad — typically one formula covers 60–70% of shades, and edge shades need market-specific versions.
Q: Our brief says “SPF 15 lip balm” — is that straightforward?
In the US, that’s an OTC drug product the moment SPF goes on the label. That means monograph-compliant UV filters, OTC drug labeling, and a different regulatory pathway entirely. In the EU it stays cosmetic but triggers SCCS safety assessment considerations for ingestion exposure. Budget an extra 8–12 weeks for regulatory preparation on SPF lip products versus a standard cosmetic lip balm.
Q: Can we claim “clinically proven plumping” based on a consumer panel?
Consumer self-assessment alone won’t support “clinically proven” in most markets. You need instrumental data — 3D profilometry showing at least 5% volume increase at a defined timepoint, with n=30 minimum, is the threshold we work toward. The consumer panel supports the perception claim (“consumers agree lips look fuller”), the instrumental data supports the clinical claim. You need both.
Q: How long does a 12-week lip efficacy study take from brief to final report?
Add the 2-week run-in to the 12 weeks and you’re at 14 weeks of subject time. Add 4–6 weeks for recruitment and screening and 3–4 weeks for data analysis and report writing. Realistically, that’s 21–24 weeks from brief to final report. Plan for that in your launch timeline. Brands that brief us in Q1 expecting a Q2 launch with clinical claims are usually disappointed.
Q: Carmine is on our brief as a key colorant — any issues?
Carmine (CI 75470) is approved in all three major markets, but it requires declaration as an allergen in the EU (“may cause allergic reactions” on-pack) and is a vegan/halal concern that some consumer segments actively avoid. We always flag it at brief stage. If your brand positioning is vegan or clean beauty, we’ll steer you toward synthetic alternatives — typically a blend of D&C Red No. 7 and iron oxides — that can approximate carmine shades within about ΔE 3.0 of the original.
© 2026 Mastracare.com. All rights reserved.