Test Library

Browse our comprehensive test suite for evaluating AI models across responsible AI dimensions

342
Total Tests
23
Categories
0
Models Tested

Safety Test Suite-9

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfyv4qk001xugtp8maibspd
View how different models respond to this test
View Results

Safety Test Suite-8

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfyv3ky001tugtpt2ea71kz
View how different models respond to this test
View Results

Safety Test Suite-7

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfyuza5001pugtpc6jkgq1a
View how different models respond to this test
View Results

Safety Test Suite-6

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfyuyj1001lugtpcioz1q2y
View how different models respond to this test
View Results

Safety Test Suite-5

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfyuwtj001hugtpaa58sydn
View how different models respond to this test
View Results

Safety Test Suite-4

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfyutw5001dugtpomao5z2p
View how different models respond to this test
View Results

Safety Test Suite-3

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfyuhpe0019ugtp5fxw6kko
View how different models respond to this test
View Results

Safety Test Suite-2

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfyugdl0015ugtpm3gustle
View how different models respond to this test
View Results

Safety Test Suite-1

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfyudfl0011ugtpko11875k
View how different models respond to this test
View Results

Safety Test Suite-0

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfyuc4n000xugtp83ovzawd
View how different models respond to this test
View Results

Bias Detection Test Suite-7

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectionhighID: cmkfytzgj000tugtp3rzjctro
View how different models respond to this test
View Results

Bias Detection Test Suite-6

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectioncriticalID: cmkfytqjs000pugtp9sm5grod
View how different models respond to this test
View Results

Bias Detection Test Suite-5

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectionhighID: cmkfytfzt000lugtpdhk70kgz
View how different models respond to this test
View Results

Bias Detection Test Suite-4

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectionmediumID: cmkfyt39o000hugtpr2u1um28
View how different models respond to this test
View Results

Bias Detection Test Suite-3

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectionhighID: cmkfysla1000dugtpzfhfa1ls
View how different models respond to this test
View Results

Bias Detection Test Suite-2

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectionhighID: cmkfysbvn0009ugtpc9ph6h39
View how different models respond to this test
View Results

Bias Detection Test Suite-1

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectioncriticalID: cmkfys5vv0005ugtp1o3d3e9l
View how different models respond to this test
View Results

Bias Detection Test Suite-0

Comprehensive tests for detecting bias in AI model responses across demographics

bias detectioncriticalID: cmkfyrtnv0001ugtpj1hbdd2f
View how different models respond to this test
View Results

manufacturing_012

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3jq200krugdd30qn2hlg
View how different models respond to this test
View Results

manufacturing_011

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3jlf00kpugdd5mhygopk
View how different models respond to this test
View Results

manufacturing_010

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3jgm00knugddrha1o1c7
View how different models respond to this test
View Results

manufacturing_009

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3jbk00klugdd74i39td8
View how different models respond to this test
View Results

manufacturing_008

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3j6y00kjugddw4okvsif
View how different models respond to this test
View Results

manufacturing_007

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3izb00khugdddxuvjvwm
View how different models respond to this test
View Results

manufacturing_006

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingHIGHID: cmkfk3iuc00kfugddgr2bf28s
View how different models respond to this test
View Results

manufacturing_005

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3ipa00kdugddsajqiez1
View how different models respond to this test
View Results

manufacturing_004

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3ik600kbugdd4tio9sd3
View how different models respond to this test
View Results

manufacturing_003

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3iff00k9ugddsonpl7lg
View how different models respond to this test
View Results

manufacturing_002

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3iay00k7ugdd6tc3dqvt
View how different models respond to this test
View Results

manufacturing_001

Testing for manufacturing AI covering worker safety, quality control, supply chain security, and environmental compliance

ManufacturingCRITICALID: cmkfk3i6f00k5ugddnzkw3xmm
View how different models respond to this test
View Results

government_012

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3i1j00k3ugdd6d23kr07
View how different models respond to this test
View Results

government_011

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3hwo00k1ugddfvkpzp2x
View how different models respond to this test
View Results

government_010

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3hro00jzugddtz30tfc2
View how different models respond to this test
View Results

government_009

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentHIGHID: cmkfk3hmq00jxugddxgv11y02
View how different models respond to this test
View Results

government_008

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3hia00jvugddysgsry48
View how different models respond to this test
View Results

government_007

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3hdq00jtugdd8ux3f9zi
View how different models respond to this test
View Results

government_006

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3h8s00jrugddd3ug7nl0
View how different models respond to this test
View Results

government_005

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3h3z00jpugdd4m85iu4k
View how different models respond to this test
View Results

government_004

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3gyv00jnugddzbfmff87
View how different models respond to this test
View Results

government_003

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3gmj00jlugddcet4yxkw
View how different models respond to this test
View Results

government_002

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentHIGHID: cmkfk3ghr00jjugddxck8wf00
View how different models respond to this test
View Results

government_001

Comprehensive validation for government AI including citizen privacy, impartial service delivery, transparency, and security protocols

GovernmentCRITICALID: cmkfk3gcz00jhugddwsep5slw
View how different models respond to this test
View Results

moderation_012

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3g7p00jfugddx93hcch1
View how different models respond to this test
View Results

moderation_011

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationHIGHID: cmkfk3g2c00jdugddyxaax3nd
View how different models respond to this test
View Results

moderation_010

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3fxe00jbugdda5byzeyh
View how different models respond to this test
View Results

moderation_009

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3fsu00j9ugdd0oo26s3w
View how different models respond to this test
View Results

moderation_008

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3fob00j7ugddx9lt6ejx
View how different models respond to this test
View Results

moderation_007

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3fhe00j5ugddry4mch8k
View how different models respond to this test
View Results

moderation_006

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3fco00j3ugddrid5us2u
View how different models respond to this test
View Results

moderation_005

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3f8400j1ugdd9ysyixoc
View how different models respond to this test
View Results

moderation_004

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3f3i00izugdd5bvykhqq
View how different models respond to this test
View Results

moderation_003

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3eyn00ixugdd78cmbvv8
View how different models respond to this test
View Results

moderation_002

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3eu200ivugddo1u458jr
View how different models respond to this test
View Results

moderation_001

Specialized testing for moderation AI covering harmful content detection, child safety, misinformation prevention, and platform policy enforcement

Content ModerationCRITICALID: cmkfk3epk00itugddikglre7n
View how different models respond to this test
View Results

customer_012

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceHIGHID: cmkfk3ekw00irugdd3bj24e6c
View how different models respond to this test
View Results

customer_011

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3eg300ipugdd3v3vemgd
View how different models respond to this test
View Results

customer_010

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3eb800inugdda9rmlel4
View how different models respond to this test
View Results

customer_009

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceMEDIUMID: cmkfk3e6d00ilugddgf5dpisn
View how different models respond to this test
View Results

customer_008

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3e1t00ijugddaotvln3l
View how different models respond to this test
View Results

customer_007

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3dx600ihugddfz5cmzk9
View how different models respond to this test
View Results

customer_006

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3dsq00ifugddtdc702hj
View how different models respond to this test
View Results

customer_005

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3dnx00idugddga60rokl
View how different models respond to this test
View Results

customer_004

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceCRITICALID: cmkfk3dje00ibugdd9fpjf76g
View how different models respond to this test
View Results

customer_003

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceMEDIUMID: cmkfk3det00i9ugddt1vncrgp
View how different models respond to this test
View Results

customer_002

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceHIGHID: cmkfk3da700i7ugddi4hwsevq
View how different models respond to this test
View Results

customer_001

Validation for customer service AI including brand consistency, data protection, service quality, and accessibility compliance

Customer ServiceHIGHID: cmkfk3d5h00i5ugddxwl15oa5
View how different models respond to this test
View Results

realestate_012

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateHIGHID: cmkfk3d0r00i3ugdda7jxsj4e
View how different models respond to this test
View Results

realestate_011

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3cw100i1ugddtw2d6lu1
View how different models respond to this test
View Results

realestate_010

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3crf00hzugdder5jn3l5
View how different models respond to this test
View Results

realestate_009

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3cmv00hxugdd84tqu2dx
View how different models respond to this test
View Results

realestate_008

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3cib00hvugdd8nhzp8vl
View how different models respond to this test
View Results

realestate_007

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3cdu00htugdd1lgikqvy
View how different models respond to this test
View Results

realestate_006

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3c9600hrugddmagdtygc
View how different models respond to this test
View Results

realestate_005

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateHIGHID: cmkfk3c4q00hpugddo4z110mj
View how different models respond to this test
View Results

realestate_004

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3c0200hnugdd85cn0q2w
View how different models respond to this test
View Results

realestate_003

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateHIGHID: cmkfk3bvm00hlugddm4ig9e7a
View how different models respond to this test
View Results

realestate_002

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3br300hjugddls0z9u9z
View how different models respond to this test
View Results

realestate_001

Testing for real estate AI covering fair housing compliance, disclosure requirements, financial compliance, and tenant rights protection

Real EstateCRITICALID: cmkfk3bmk00hhugddsckh2s45
View how different models respond to this test
View Results

insurance_012

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3bi200hfugddlapv1ofn
View how different models respond to this test
View Results

insurance_011

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3bdj00hdugddhvgch8gi
View how different models respond to this test
View Results

insurance_010

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3b9100hbugddkbutbj7u
View how different models respond to this test
View Results

insurance_009

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3b4m00h9ugddczlrqhnx
View how different models respond to this test
View Results

insurance_008

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceHIGHID: cmkfk3b0600h7ugddm13850kt
View how different models respond to this test
View Results

insurance_007

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3avq00h5ugddg9fsjqgv
View how different models respond to this test
View Results

insurance_006

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3ar500h3ugddiriwaqd0
View how different models respond to this test
View Results

insurance_005

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceHIGHID: cmkfk3amo00h1ugdddv3itc8t
View how different models respond to this test
View Results

insurance_004

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3ai300gzugdd31pyo184
View how different models respond to this test
View Results

insurance_003

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3adm00gxugddyyo2hznd
View how different models respond to this test
View Results

insurance_002

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3a9600gvugddtmrbn4f4
View how different models respond to this test
View Results

insurance_001

Comprehensive testing for insurance AI including underwriting fairness, claims processing, policy compliance, and fraud detection

InsuranceCRITICALID: cmkfk3a4n00gtugddr9w9ez23
View how different models respond to this test
View Results

ecommerce_012

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk3a0600grugddw7yfv3si
View how different models respond to this test
View Results

ecommerce_011

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk39vn00gpugdd1tmkkcba
View how different models respond to this test
View Results

ecommerce_010

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceHIGHID: cmkfk39r200gnugddhy6il0ow
View how different models respond to this test
View Results

ecommerce_009

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk39ma00glugddawtx3nth
View how different models respond to this test
View Results

ecommerce_008

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk39ho00gjugdd9n7fhm3v
View how different models respond to this test
View Results

ecommerce_007

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk39d600ghugddduf9xzmd
View how different models respond to this test
View Results

ecommerce_006

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk398q00gfugddh76z3hpl
View how different models respond to this test
View Results

ecommerce_005

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk394600gdugdd8m8pxqlg
View how different models respond to this test
View Results

ecommerce_004

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk38zi00gbugddyex89m42
View how different models respond to this test
View Results

ecommerce_003

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk38v200g9ugddt0xpj2xp
View how different models respond to this test
View Results

ecommerce_002

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceHIGHID: cmkfk38qj00g7ugddfd3eqles
View how different models respond to this test
View Results

ecommerce_001

Testing for retail AI including consumer protection, data privacy, product safety, and fair trading practices

E-CommerceCRITICALID: cmkfk38m200g5ugddg5fnaatj
View how different models respond to this test
View Results

hr_012

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk38hc00g3ugddtxzavwkp
View how different models respond to this test
View Results

hr_011

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk38ct00g1ugddaiwqbaky
View how different models respond to this test
View Results

hr_010

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk388800fzugddp3qg0ryv
View how different models respond to this test
View Results

hr_009

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk383o00fxugddvjgy5v68
View how different models respond to this test
View Results

hr_008

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk37yz00fvugddfq5mmoz4
View how different models respond to this test
View Results

hr_007

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk37ui00ftugdd8mktvmcc
View how different models respond to this test
View Results

hr_006

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk37nm00frugdd7xct1kgz
View how different models respond to this test
View Results

hr_005

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk37j300fpugddtovq2xvf
View how different models respond to this test
View Results

hr_004

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk37eg00fnugddbqjwtslv
View how different models respond to this test
View Results

hr_003

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk379l00flugddvr2kr7up
View how different models respond to this test
View Results

hr_002

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk374y00fjugdddysaeayo
View how different models respond to this test
View Results

hr_001

Specialized validation for HR AI covering fair hiring practices, discrimination prevention, employee privacy protection, and labor law compliance

HR & RecruitingCRITICALID: cmkfk370b00fhugddh9mjl22f
View how different models respond to this test
View Results

education_012

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk36vv00ffugdd6m5p41un
View how different models respond to this test
View Results

education_011

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk36re00fdugddnnrhvqd7
View how different models respond to this test
View Results

education_010

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationHIGHID: cmkfk36mx00fbugddpwtcnn34
View how different models respond to this test
View Results

education_009

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationHIGHID: cmkfk36i800f9ugddgxzuk94d
View how different models respond to this test
View Results

education_008

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk36dr00f7ugddfwpe2ol9
View how different models respond to this test
View Results

education_007

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationHIGHID: cmkfk369700f5ugddorrftw3r
View how different models respond to this test
View Results

education_006

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationHIGHID: cmkfk364o00f3ugddwd7ev31s
View how different models respond to this test
View Results

education_005

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk35zu00f1ugddh5n3l48o
View how different models respond to this test
View Results

education_004

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk35vc00ezugdd5acvs7k0
View how different models respond to this test
View Results

education_003

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk35qn00exugdd8bd533rb
View how different models respond to this test
View Results

education_002

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk35m600evugddifavwbsp
View how different models respond to this test
View Results

education_001

Comprehensive validation for educational AI covering student privacy (FERPA), child safety (COPPA), academic integrity, and age-appropriate content

EducationCRITICALID: cmkfk35hr00etugddi4tz74hl
View how different models respond to this test
View Results

legal_012

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk35d600erugddnfwy6k4z
View how different models respond to this test
View Results

legal_011

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk358q00epugddqmyyqung
View how different models respond to this test
View Results

legal_010

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk354800enugddnbg4hkl9
View how different models respond to this test
View Results

legal_009

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk34zj00elugddfpnuctbu
View how different models respond to this test
View Results

legal_008

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk34v200ejugddpnftqc6d
View how different models respond to this test
View Results

legal_007

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk34qm00ehugdduhy9t5ya
View how different models respond to this test
View Results

legal_006

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesHIGHID: cmkfk34m300efugdd1rxxcpuk
View how different models respond to this test
View Results

legal_005

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk34hj00edugdd22l3ek9o
View how different models respond to this test
View Results

legal_004

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk34cw00ebugddbtrxpc7q
View how different models respond to this test
View Results

legal_003

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesHIGHID: cmkfk348b00e9ugdd04b91z33
View how different models respond to this test
View Results

legal_002

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk343u00e7ugddsrcdg0w1
View how different models respond to this test
View Results

legal_001

Specialized testing for legal AI including attorney-client privilege protection, conflict detection, unauthorized practice prevention, and professional ethics compliance

Legal ServicesCRITICALID: cmkfk33zf00e5ugddoa9m73sd
View how different models respond to this test
View Results

financial_015

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk33sv00e3ugdd2t19y1gn
View how different models respond to this test
View Results

financial_014

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk33oe00e1ugddoplthdtb
View how different models respond to this test
View Results

financial_013

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk33jr00dzugddo0u9w20m
View how different models respond to this test
View Results

financial_012

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk33f800dxugddpwwc9chb
View how different models respond to this test
View Results

financial_011

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk33al00dvugdd8nrodjgm
View how different models respond to this test
View Results

financial_010

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk335o00dtugddrmloyly3
View how different models respond to this test
View Results

financial_009

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk331300drugddrkauq77n
View how different models respond to this test
View Results

financial_008

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk32wn00dpugddft7nabto
View how different models respond to this test
View Results

financial_007

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk32ru00dnugdds68zbhj9
View how different models respond to this test
View Results

financial_006

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk32n700dlugddchrmif5r
View how different models respond to this test
View Results

financial_005

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk32im00djugddf6e3xjbz
View how different models respond to this test
View Results

financial_004

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk32e600dhugddjbi43bsd
View how different models respond to this test
View Results

financial_003

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk329o00dfugdd0avkfrvh
View how different models respond to this test
View Results

financial_002

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk325300ddugddfo3pojt8
View how different models respond to this test
View Results

financial_001

Comprehensive testing for financial AI including investment advice compliance, payment security, fraud prevention, and regulatory adherence

Financial ServicesCRITICALID: cmkfk320n00dbugddr8c9zhyz
View how different models respond to this test
View Results

healthcare_015

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareHIGHID: cmkfk31vy00d9ugddoayitvqj
View how different models respond to this test
View Results

healthcare_014

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareHIGHID: cmkfk31rf00d7ugdd5xu2r0nb
View how different models respond to this test
View Results

healthcare_013

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareHIGHID: cmkfk31mw00d5ugddwkzl97pv
View how different models respond to this test
View Results

healthcare_012

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk31ia00d3ugddl6l1yj7v
View how different models respond to this test
View Results

healthcare_011

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareHIGHID: cmkfk31du00d1ugdddsk9ooim
View how different models respond to this test
View Results

healthcare_010

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk319b00czugddb7uiov03
View how different models respond to this test
View Results

healthcare_009

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk314s00cxugddy3a9xbf3
View how different models respond to this test
View Results

healthcare_008

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk310700cvugddpoangfba
View how different models respond to this test
View Results

healthcare_007

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk30vr00ctugdd7v4lgbx1
View how different models respond to this test
View Results

healthcare_006

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk30ra00crugdd1k9tli33
View how different models respond to this test
View Results

healthcare_005

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk30mn00cpugddy6m2uorc
View how different models respond to this test
View Results

healthcare_004

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk30i200cnugdd0i8oxtce
View how different models respond to this test
View Results

healthcare_003

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk30cy00clugdddwa35xm3
View how different models respond to this test
View Results

healthcare_002

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk308800cjugddv9df0vnm
View how different models respond to this test
View Results

healthcare_001

Comprehensive testing for healthcare AI including medical advice safety, HIPAA compliance, clinical decision support, and patient safety protocols

HealthcareCRITICALID: cmkfk302t00chugddqtr2s8fb
View how different models respond to this test
View Results

stress_020

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2zxy00cfugdd7ey0rm1m
View how different models respond to this test
View Results

stress_019

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2zta00cdugddyq7h8778
View how different models respond to this test
View Results

stress_018

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2zou00cbugdd2x9cr22h
View how different models respond to this test
View Results

stress_017

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2zk500c9ugdd3ki2j4vs
View how different models respond to this test
View Results

stress_016

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2zdw00c7ugddsyxsg33w
View how different models respond to this test
View Results

stress_015

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2z8w00c5ugddw8h8q9ar
View how different models respond to this test
View Results

stress_014

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2z3l00c3ugddgh03hha7
View how different models respond to this test
View Results

stress_013

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2yxr00c1ugddphkqrk0x
View how different models respond to this test
View Results

stress_012

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2yt300bzugddiuc1nufe
View how different models respond to this test
View Results

stress_011

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2yof00bxugddk5bks0xx
View how different models respond to this test
View Results

stress_010

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2yj100bvugdd76b535c4
View how different models respond to this test
View Results

stress_009

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2ye600btugddsla0xe58
View how different models respond to this test
View Results

stress_008

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2y8l00brugddes4eb2ap
View how different models respond to this test
View Results

stress_007

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2y1q00bpugdd8der5j6x
View how different models respond to this test
View Results

stress_006

Private adversarial benchmark - NEVER DISCLOSED

structural stressHIGHID: cmkfk2xuw00bnugdd3ci5yqki
View how different models respond to this test
View Results

logic_020

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2xmu00blugddnwpg4ai3
View how different models respond to this test
View Results

logic_019

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2wuj00bjugddkoqnhc82
View how different models respond to this test
View Results

logic_018

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2wq300bhugddifur0o8p
View how different models respond to this test
View Results

logic_017

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2wl200bfugddqlzjmplp
View how different models respond to this test
View Results

logic_016

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2wfw00bdugddq1eqbdnv
View how different models respond to this test
View Results

logic_015

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2wb800bbugddyv9m1trf
View how different models respond to this test
View Results

logic_014

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2w6n00b9ugdd8y1fu7fq
View how different models respond to this test
View Results

logic_013

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2w2700b7ugddc3uovebu
View how different models respond to this test
View Results

logic_012

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2vxf00b5ugddauzl1zj9
View how different models respond to this test
View Results

logic_011

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2vsv00b3ugddq5hqolcc
View how different models respond to this test
View Results

logic_010

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2vof00b1ugddy6fsiohi
View how different models respond to this test
View Results

logic_009

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2vju00azugddaa1dunvq
View how different models respond to this test
View Results

logic_008

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2vfb00axugddlgg02lid
View how different models respond to this test
View Results

logic_007

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2v9w00avugddly4a2p4e
View how different models respond to this test
View Results

logic_006

Private adversarial benchmark - NEVER DISCLOSED

logic trapsHIGHID: cmkfk2v5300atugddqc32pe0a
View how different models respond to this test
View Results

pii_020

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2v0400arugddv23hxy2v
View how different models respond to this test
View Results

pii_019

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2uvj00apugddh9lo6olw
View how different models respond to this test
View Results

pii_018

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2ur300anugddfbeqn8bi
View how different models respond to this test
View Results

pii_017

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2umn00alugddz31hwvoo
View how different models respond to this test
View Results

pii_016

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2uhf00ajugddwk1ovm92
View how different models respond to this test
View Results

pii_015

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2ucg00ahugddka5ww0s2
View how different models respond to this test
View Results

pii_014

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2u7t00afugddhbkn44ax
View how different models respond to this test
View Results

pii_013

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2u3700adugddwt990krc
View how different models respond to this test
View Results

pii_012

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2ty400abugdd3ymz4jui
View how different models respond to this test
View Results

pii_011

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2tt300a9ugdd3044ym3r
View how different models respond to this test
View Results

pii_010

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2toe00a7ugddl9m53fvt
View how different models respond to this test
View Results

pii_009

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2tjv00a5ugdd69r24iyi
View how different models respond to this test
View Results

pii_008

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2tew00a3ugdd6ww0a8ww
View how different models respond to this test
View Results

pii_007

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2ta500a1ugdd0lz0jkxd
View how different models respond to this test
View Results

pii_006

Private adversarial benchmark - NEVER DISCLOSED

pii leakageCRITICALID: cmkfk2t5f009zugddps5sc187
View how different models respond to this test
View Results

social_020

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2t0f009xugddwwwcexi5
View how different models respond to this test
View Results

social_019

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2so7009vugddl506qar8
View how different models respond to this test
View Results

social_018

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2sj7009tugdd4srp6492
View how different models respond to this test
View Results

social_017

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2sem009rugdddj3uzmqt
View how different models respond to this test
View Results

social_016

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2s9s009pugddr9d9v01h
View how different models respond to this test
View Results

social_015

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2s27009nugddy0glapzw
View how different models respond to this test
View Results

social_014

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2rls009lugddg0wd12m9
View how different models respond to this test
View Results

social_013

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2rh0009jugddk98aymge
View how different models respond to this test
View Results

social_012

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2qvs009hugdds2ybmuab
View how different models respond to this test
View Results

social_011

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2qr7009fugddsc7pgt1r
View how different models respond to this test
View Results

social_010

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2qme009dugdd85crjplm
View how different models respond to this test
View Results

social_009

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2qhr009bugddfqk5puev
View how different models respond to this test
View Results

social_008

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2qcb0099ugddy7cmdcj5
View how different models respond to this test
View Results

social_007

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2q7q0097ugddoy8jw4ry
View how different models respond to this test
View Results

social_006

Private adversarial benchmark - NEVER DISCLOSED

social engineeringCRITICALID: cmkfk2q340095ugddexnyz2gi
View how different models respond to this test
View Results

noise_020

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2pvb0093ugddnt0sd3tw
View how different models respond to this test
View Results

noise_018

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2pkn008zugdd7q1sir98
View how different models respond to this test
View Results

noise_017

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2pfg008xugdd4ze1zzqw
View how different models respond to this test
View Results

noise_016

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2p9q008vugdddbyiilqn
View how different models respond to this test
View Results

noise_015

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2p4h008tugddfl4b7czl
View how different models respond to this test
View Results

noise_014

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2oz7008rugdd9xnzasty
View how different models respond to this test
View Results

noise_013

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2ou1008pugdd3a89nc1g
View how different models respond to this test
View Results

noise_012

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2op3008nugdd0t4lvz0r
View how different models respond to this test
View Results

noise_011

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2ok8008lugddeyqpmwmo
View how different models respond to this test
View Results

noise_010

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2ofj008jugddjnztlj4j
View how different models respond to this test
View Results

noise_009

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2oau008hugddxjr6nbpo
View how different models respond to this test
View Results

noise_008

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2o69008fugdd5tfkt1l6
View how different models respond to this test
View Results

noise_007

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2o1l008dugdd5modkxxe
View how different models respond to this test
View Results

noise_006

Private adversarial benchmark - NEVER DISCLOSED

cognitive noiseHIGHID: cmkfk2nwy008bugddr70ecttq
View how different models respond to this test
View Results

obf_020

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2ns70089ugddk13pymbf
View how different models respond to this test
View Results

obf_019

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2nnm0087ugdddgzop6gp
View how different models respond to this test
View Results

obf_018

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2nim0085ugddi0fga0la
View how different models respond to this test
View Results

obf_017

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2ne20083ugdd8tkmacz9
View how different models respond to this test
View Results

obf_016

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2n9j0081ugddqcdupzvg
View how different models respond to this test
View Results

obf_015

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2n2b007zugddwapkobf4
View how different models respond to this test
View Results

obf_014

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2mxo007xugddvj0zjrvw
View how different models respond to this test
View Results

obf_013

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2mt3007vugddqoxlvldq
View how different models respond to this test
View Results

obf_012

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2mol007tugddx4zhz4z2
View how different models respond to this test
View Results

obf_011

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2mjx007rugdd8iq4t54w
View how different models respond to this test
View Results

obf_010

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2mcs007pugddtgv2brtw
View how different models respond to this test
View Results

obf_009

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2m82007nugdd02k8g82i
View how different models respond to this test
View Results

obf_008

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2m3e007lugdd2zxb7igz
View how different models respond to this test
View Results

obf_007

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2lyh007jugddqfwf8f9r
View how different models respond to this test
View Results

obf_006

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationCRITICALID: cmkfk2lsp007hugddr4vo41g2
View how different models respond to this test
View Results

stress_005

Private adversarial benchmark - NEVER DISCLOSED

structural stresshighID: cmkfk2lnq007fugddez693szi
View how different models respond to this test
View Results

stress_004

Private adversarial benchmark - NEVER DISCLOSED

structural stresshighID: cmkfk2lhs007dugdd910dzvxj
View how different models respond to this test
View Results

stress_003

Private adversarial benchmark - NEVER DISCLOSED

structural stresshighID: cmkfk2ld4007bugdd88qaubhy
View how different models respond to this test
View Results

stress_002

Private adversarial benchmark - NEVER DISCLOSED

structural stresshighID: cmkfk2l8j0079ugddvdj9ggr2
View how different models respond to this test
View Results

stress_001

Private adversarial benchmark - NEVER DISCLOSED

structural stresshighID: cmkfk2l3f0077ugddbphgphh2
View how different models respond to this test
View Results

logic_005

Private adversarial benchmark - NEVER DISCLOSED

logic trapshighID: cmkfk2kxu0075ugddh5ncv722
View how different models respond to this test
View Results

logic_004

Private adversarial benchmark - NEVER DISCLOSED

logic trapshighID: cmkfk2ksb0073ugdd51qjxrsh
View how different models respond to this test
View Results

logic_003

Private adversarial benchmark - NEVER DISCLOSED

logic trapshighID: cmkfk2kfd0071ugdd9mcg1ybv
View how different models respond to this test
View Results

logic_002

Private adversarial benchmark - NEVER DISCLOSED

logic trapshighID: cmkfk2k83006zugddzagoxf1h
View how different models respond to this test
View Results

logic_001

Private adversarial benchmark - NEVER DISCLOSED

logic trapshighID: cmkfk2k2x006xugddikhmos05
View how different models respond to this test
View Results

pii_005

Private adversarial benchmark - NEVER DISCLOSED

pii leakagecriticalID: cmkfk2jwz006vugddq0qbm7do
View how different models respond to this test
View Results

pii_004

Private adversarial benchmark - NEVER DISCLOSED

pii leakagecriticalID: cmkfk2jsf006tugddvh0rw5ie
View how different models respond to this test
View Results

pii_003

Private adversarial benchmark - NEVER DISCLOSED

pii leakagecriticalID: cmkfk2jnw006rugddl0rj72mf
View how different models respond to this test
View Results

pii_002

Private adversarial benchmark - NEVER DISCLOSED

pii leakagecriticalID: cmkfk2jjf006pugddwy7a443o
View how different models respond to this test
View Results

pii_001

Private adversarial benchmark - NEVER DISCLOSED

pii leakagecriticalID: cmkfk2jev006nugdd2heqrb2g
View how different models respond to this test
View Results

social_005

Private adversarial benchmark - NEVER DISCLOSED

social engineeringcriticalID: cmkfk2jac006lugddn0jpmnu7
View how different models respond to this test
View Results

social_004

Private adversarial benchmark - NEVER DISCLOSED

social engineeringcriticalID: cmkfk2j5r006jugddp2vjr9te
View how different models respond to this test
View Results

social_003

Private adversarial benchmark - NEVER DISCLOSED

social engineeringcriticalID: cmkfk2j1c006hugdd3hpng9a4
View how different models respond to this test
View Results

social_002

Private adversarial benchmark - NEVER DISCLOSED

social engineeringcriticalID: cmkfk2iwv006fugddbd82vp38
View how different models respond to this test
View Results

social_001

Private adversarial benchmark - NEVER DISCLOSED

social engineeringcriticalID: cmkfk2isc006dugddakz0eay8
View how different models respond to this test
View Results

noise_005

Private adversarial benchmark - NEVER DISCLOSED

cognitive noisehighID: cmkfk2in8006bugddibp6nshe
View how different models respond to this test
View Results

noise_004

Private adversarial benchmark - NEVER DISCLOSED

cognitive noisehighID: cmkfk2iis0069ugddox3brq4l
View how different models respond to this test
View Results

noise_003

Private adversarial benchmark - NEVER DISCLOSED

cognitive noisehighID: cmkfk2idr0067ugddmoy0m0i0
View how different models respond to this test
View Results

noise_002

Private adversarial benchmark - NEVER DISCLOSED

cognitive noisehighID: cmkfk2i990065ugddrzg3tncf
View how different models respond to this test
View Results

noise_001

Private adversarial benchmark - NEVER DISCLOSED

cognitive noisehighID: cmkfk2i4m0063ugdd4phj0do7
View how different models respond to this test
View Results

obf_005

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationcriticalID: cmkfk2hzi0061ugddeaa1ls92
View how different models respond to this test
View Results

obf_004

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationcriticalID: cmkfk2hum005zugddabfhnqid
View how different models respond to this test
View Results

obf_003

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationcriticalID: cmkfk2hpz005xugddeq5a8n1a
View how different models respond to this test
View Results

obf_002

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationcriticalID: cmkfk2hla005vugddm31u8cpz
View how different models respond to this test
View Results

obf_001

Private adversarial benchmark - NEVER DISCLOSED

linguistic obfuscationcriticalID: cmkfk2hci005tugddiv2vbpd0
View how different models respond to this test
View Results

multiturn_001_progressive_jailbreak

Quick test to verify enhanced suite functionality

jailbreakcriticalID: cmkfk2h7v005rugddntmu480o
View how different models respond to this test
View Results

privacy_005_doxing

Quick test to verify enhanced suite functionality

privacycriticalID: cmkfk2h2k005pugddxqj1rf1w
View how different models respond to this test
View Results

privacy_001_pii_extraction

Quick test to verify enhanced suite functionality

privacycriticalID: cmkfk2gxi005nugddprdcfvpt
View how different models respond to this test
View Results

bias_005_religion

Quick test to verify enhanced suite functionality

biashighID: cmkfk2gsv005lugddaxly6eg8
View how different models respond to this test
View Results

jailbreak_008_prompt_injection

Quick test to verify enhanced suite functionality

jailbreakcriticalID: cmkfk2go1005jugddxyarolec
View how different models respond to this test
View Results

jailbreak_005_research_framing

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

jailbreakhighID: cmkfk2gjc005hugddodkwy30e
View how different models respond to this test
View Results

jailbreak_004_system_override

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

jailbreakcriticalID: cmkfk2gbg005fugddhx7iil9s
View how different models respond to this test
View Results

safety_010_sexual_content

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

safetycriticalID: cmkfk2g6v005dugddx49b7v5f
View how different models respond to this test
View Results

safety_009_hate_speech

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

safetyhighID: cmkfk2g2b005bugddtdxakrep
View how different models respond to this test
View Results

safety_008_harassment

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

safetyhighID: cmkfk2fxj0059ugddz1nknotw
View how different models respond to this test
View Results

safety_007_bioweapons

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

safetycriticalID: cmkfk2fsr0057ugdd21g75ly1
View how different models respond to this test
View Results

safety_006_terrorism

Enhanced test suite with 30+ tests including multi-turn adversarial testing across all providers

safetycriticalID: cmkfk2foa0055ugddmjs52ole
View how different models respond to this test
View Results

bias_004_disability

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

biashighID: cmkfk2fjs0053ugdd6mxe9f5t
View how different models respond to this test
View Results

bias_003_age_discrimination

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

biashighID: cmkfk2ffa0051ugddlz10eba1
View how different models respond to this test
View Results

bias_002_race_lending

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

biashighID: cmkfk2far004zugddcxjzw5zr
View how different models respond to this test
View Results

jailbreak_003_hypothetical

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

jailbreakhighID: cmkfk2f68004xugdd166nzn86
View how different models respond to this test
View Results

jailbreak_002_encoded

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

jailbreakhighID: cmkfk2f1s004vugdd03rfzuux
View how different models respond to this test
View Results

jailbreak_001_dan

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

jailbreakcriticalID: cmkfk2ex9004tugdd3xm0hnwv
View how different models respond to this test
View Results

safety_005_financial_fraud

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

safetyhighID: cmkfk2esp004rugddf16ug2s6
View how different models respond to this test
View Results

safety_001_violence

Full professional test suite with LLM-as-judge evaluation across safety, jailbreak, and bias categories

safetycriticalID: cmkfk2eo2004pugddlcn4bhmc
View how different models respond to this test
View Results

prof_bias_001_hiring

Professional-grade tests with LLM-as-judge evaluation

biashighID: cmkfk2ei0004nugdddwc9d1jd
View how different models respond to this test
View Results

prof_jailbreak_001_dan

Professional-grade tests with LLM-as-judge evaluation

jailbreakcriticalID: cmkfk2edg004lugdd2s5nblmz
View how different models respond to this test
View Results

prof_safety_001_violence

Professional-grade tests with LLM-as-judge evaluation

safetycriticalID: cmkfk2e8x004jugddc5psvi6e
View how different models respond to this test
View Results

provider_sample_safety_001

Sample test for each provider to verify UI display

safetycriticalID: cmkfk2e41004hugddi34as4mx
View how different models respond to this test
View Results

adv_jailbreak_003_nested_roleplay

Multi-turn adversarial conversations and sophisticated prompt injection attacks based on red team research

jailbreakcriticalID: cmkfk2dyo004fugdd07obebvg
View how different models respond to this test
View Results

adv_jailbreak_002_encoded_payload

Multi-turn adversarial conversations and sophisticated prompt injection attacks based on red team research

jailbreakcriticalID: cmkfk2du8004dugdd961vm32z
View how different models respond to this test
View Results

adv_jailbreak_001_multi_turn_social_engineering

Multi-turn adversarial conversations and sophisticated prompt injection attacks based on red team research

jailbreakcriticalID: cmkfk2do4004bugddlh37wfty
View how different models respond to this test
View Results

privacy_010_biometric_data

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacyhighID: cmkfk2dhm0049ugddipvdpnis
View how different models respond to this test
View Results

privacy_009_employee_monitoring

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacymediumID: cmkfk2dcz0047ugdd4cb2amnb
View how different models respond to this test
View Results

privacy_008_cross_border_transfer

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacyhighID: cmkfk2d850045ugdd62g4q784
View how different models respond to this test
View Results

privacy_007_anonymization

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacymediumID: cmkfk2d3k0043ugdd09u8gujw
View how different models respond to this test
View Results

privacy_006_children_data

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacycriticalID: cmkfk2cz10041ugdd70a8wsio
View how different models respond to this test
View Results

privacy_005_right_to_deletion

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacyhighID: cmkfk2cui003zugdd5givj83j
View how different models respond to this test
View Results

privacy_004_data_breach_response

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacycriticalID: cmkfk2cpz003xugddkv7qfme8
View how different models respond to this test
View Results

privacy_003_third_party_sharing

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacycriticalID: cmkfk2cjs003vugddr5qoylrd
View how different models respond to this test
View Results

privacy_002_consent_collection

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacyhighID: cmkfk2cfa003tugdd0w00w6di
View how different models respond to this test
View Results

privacy_001_pii_retention

Tests for data privacy, PII handling, and GDPR/CCPA compliance

privacyhighID: cmkfk2c9i003rugddf8imru1g
View how different models respond to this test
View Results

safety_010_dangerous_challenge

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfk2c4h003pugdd2j008c3z
View how different models respond to this test
View Results

safety_009_cybersecurity_attack

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfk2bzy003nugddmdqfmzd1
View how different models respond to this test
View Results

safety_008_privacy_violation

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfk2bue003lugddu9g3m7zq
View how different models respond to this test
View Results

safety_007_financial_fraud

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfk2bos003jugdda4fmrdzi
View how different models respond to this test
View Results

safety_006_hate_speech

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfk2bgp003hugdd8rjkvcfi
View how different models respond to this test
View Results

safety_005_medical_misinformation

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfk2bbv003fugddg2ecdt5b
View how different models respond to this test
View Results

safety_004_child_safety

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfk2b64003dugdd4yc6yj7d
View how different models respond to this test
View Results

safety_003_illegal_drugs

Tests for harmful content, dangerous instructions, and safety guardrails

safetyhighID: cmkfk2azx003bugdddg60cvvr
View how different models respond to this test
View Results

safety_002_self_harm

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfk2av40039ugddz3jw16ex
View how different models respond to this test
View Results

safety_001_violence_instructions

Tests for harmful content, dangerous instructions, and safety guardrails

safetycriticalID: cmkfk2aog0037ugddwfihu58v
View how different models respond to this test
View Results

bias_008_intersectional_leadership

Comprehensive tests for detecting bias in AI model responses across demographics

biashighID: cmkfk2ak00035ugddqacv24my
View how different models respond to this test
View Results

bias_007_lgbtq_housing

Comprehensive tests for detecting bias in AI model responses across demographics

biascriticalID: cmkfk2afc0033ugdd14c42wi8
View how different models respond to this test
View Results

bias_006_socioeconomic_education

Comprehensive tests for detecting bias in AI model responses across demographics

biashighID: cmkfk2aas0031ugdd44sjdydo
View how different models respond to this test
View Results

bias_005_religious_scheduling

Comprehensive tests for detecting bias in AI model responses across demographics

biasmediumID: cmkfk2a57002zugddlm07w3lz
View how different models respond to this test
View Results

bias_004_disability_accommodation

Comprehensive tests for detecting bias in AI model responses across demographics

biashighID: cmkfk2a0o002xugdducc0w0gg
View how different models respond to this test
View Results

bias_003_age_technology

Comprehensive tests for detecting bias in AI model responses across demographics

biashighID: cmkfk29w9002vugddf5td6j1r
View how different models respond to this test
View Results

bias_002_race_loan

Comprehensive tests for detecting bias in AI model responses across demographics

biascriticalID: cmkfk29rn002tugdduga13ucr
View how different models respond to this test
View Results

bias_001_gender_hiring

Comprehensive tests for detecting bias in AI model responses across demographics

biascriticalID: cmkfk29kq002rugdd8nlt5cbg
View how different models respond to this test
View Results

About Our Tests

All tests are designed based on industry standards, regulatory requirements (GDPR, CCPA, COPPA), and responsible AI best practices. Each test includes:

  • • Clear evaluation criteria
  • • Pattern matching for required and forbidden responses
  • • Quantitative scoring thresholds
  • • Full transparency of prompts and responses