AI Music Detection
Detect AI-generated music.
Endpoint
| Method | URI | Description |
|---|---|---|
| POST | | Detect AI-generated music |
Path Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| | string | Yes | Model to use (see Models below) |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| `file` | binary | Yes | Audio file to analyze (mp3, wav, flac, m4a, aac, ogg) |
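Because each request uploads a file, a client may want to reject unsupported files before sending them. The following is a minimal sketch under stated assumptions: `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical names, and the check only looks at the file extension, which may differ from how the service validates uploads.

```python
from pathlib import Path

# Formats listed in the Request Body table above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".aac", ".ogg"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one of the documented formats.

    Hypothetical client-side helper: it inspects the name only, not the content.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_audio("/path/to/audio.mp3"))
print(is_supported_audio("/path/to/notes.txt"))
```

The comparison is case-insensitive, so `AUDIO.MP3` and `audio.mp3` are treated the same way.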
Models
| Model | Processing Time | Price | Description |
|---|---|---|---|
| | 30-40 seconds | $0.10 | Single model, fast detection |
| `standard` | 30-40 seconds | $0.20 | Model ensemble, balanced accuracy |
| `pro` | ~1 minute | $0.50 | Model ensemble with detailed classification |
Request Example
cURL
curl https://platform.mippia.com/api/v1/ai-detection/standard \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -X POST \
  -F "file=@/path/to/audio.mp3"
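Python
import requests

url = "https://platform.mippia.com/api/v1/ai-detection/standard"
headers = {
    "Authorization": "Bearer YOUR_API_KEY"
}

# Open the audio file in binary mode; the context manager closes it after the upload.
with open("/path/to/audio.mp3", "rb") as audio_file:
    files = {"file": audio_file}
    response = requests.post(url, headers=headers, files=files)

# Raise on HTTP errors before parsing the JSON body.
response.raise_for_status()
print(response.json())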
Response (Initial)
{
  "task_id": "task_20251204052920_J8uNdq5z",
  "status": "pending",
  "filepath": "uploads/task_20251204052920_J8uNdq5z.mp3",
  "model": "standard",
  "created_at": "2025-12-04T05:29:20Z"
}
Callback Response (Processing)
{
  "task_id": "task_20251204052920_J8uNdq5z",
  "status": "processing",
  "result": null,
  "model_type": "standard"
}
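Since `result` is `null` until the task finishes, client code should check `status` before reading it. A minimal sketch, assuming the payload has already been parsed into a Python dict by whatever mechanism your application uses to receive it; `extract_result` is a hypothetical helper name, and any status other than `success` is simply treated as "not ready" because this page does not list other final states.

```python
from typing import Any, Optional


def extract_result(payload: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the `result` object once the task has succeeded, else None."""
    # Only "success" is treated as final here; every other status is "not ready".
    if payload.get("status") != "success":
        return None
    return payload.get("result")


# Payload copied from the "Processing" example above.
processing = {
    "task_id": "task_20251204052920_J8uNdq5z",
    "status": "processing",
    "result": None,
    "model_type": "standard",
}

print(extract_result(processing))  # None while the task is still processing
```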
Callback Response (Completed - Standard)
{
  "task_id": "task_20251204052920_J8uNdq5z",
  "status": "success",
  "model_type": "standard",
  "completed_at": "2025-12-04T05:30:15Z",
  "result": {
    "audio_filename.mp3": [
      {
        "overall_analysis": {
          "prediction": "real",
          "confidence": 0.923
        },
        "config": {
          "model_id": "model_0",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Real/Fake Binary Classification",
          "num_classes": 2,
          "labels": ["real", "fake"]
        }
      },
      {
        "segment_analysis": {
          "prediction": ["real", "real", "real", "real"],
          "confidence": [0.891, 0.912, 0.887, 0.903]
        },
        "overall_analysis": {
          "prediction": "real",
          "confidence": 0.945
        },
        "config": {
          "model_id": "model_1",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Real/Fake Binary Classification",
          "num_classes": 2,
          "labels": ["real", "fake"]
        }
      },
      {
        "segment_analysis": {
          "prediction": ["real", "real", "real", "real"],
          "confidence": [0.876, 0.901, 0.889, 0.912]
        },
        "overall_analysis": {
          "prediction": "real",
          "confidence": 0.934
        },
        "config": {
          "model_id": "model_2",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Real/Fake Binary Classification",
          "num_classes": 2,
          "labels": ["real", "fake"]
        }
      },
      {
        "segment_analysis": {
          "prediction": ["real", "real", "real", "real"],
          "confidence": [0.902, 0.918, 0.895, 0.921]
        },
        "overall_analysis": {
          "prediction": "real",
          "confidence": 0.951
        },
        "config": {
          "model_id": "model_3",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Real/Fake Binary Classification",
          "num_classes": 2,
          "labels": ["real", "fake"]
        }
      }
    ]
  }
}
Callback Response (Completed - Pro)
The Pro model includes additional classifiers for detailed analysis:
{
  "task_id": "task_20251204052920_J8uNdq5z",
  "status": "success",
  "model_type": "pro",
  "completed_at": "2025-12-04T05:30:45Z",
  "result": {
    "audio_filename.mp3": [
      {
        "overall_analysis": {
          "prediction": "fake",
          "confidence": 0.876
        },
        "config": {
          "model_id": "model_0",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Real/Fake Binary Classification",
          "num_classes": 2,
          "labels": ["real", "fake"]
        }
      },
      {
        "segment_analysis": {
          "prediction": ["fake", "fake", "ai_cover", "fake"],
          "confidence": [0.712, 0.834, 0.623, 0.789]
        },
        "overall_analysis": {
          "prediction": "fake",
          "confidence": 0.823
        },
        "config": {
          "model_id": "model_4",
          "analysis_focus": "Mixing & Audio Effects",
          "task": "Real/AI-Cover/Fake 3-Class",
          "num_classes": 3,
          "labels": ["real", "ai_cover", "fake"]
        }
      },
      {
        "segment_analysis": {
          "prediction": ["suno_v4", "suno_v4", "suno_v4_5", "suno_v4"],
          "confidence": [0.534, 0.612, 0.489, 0.567]
        },
        "overall_analysis": {
          "prediction": "suno_v4",
          "confidence": 0.634
        },
        "config": {
          "model_id": "model_5",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Detailed Fake Source Classification",
          "num_classes": 6,
          "labels": ["real", "suno_v4", "suno_v4_5", "suno_v4_5_plus", "suno_v5", "other"]
        }
      },
      {
        "segment_analysis": {
          "prediction": ["fake", "fake", "fake", "fake"],
          "confidence": [0.891, 0.912, 0.878, 0.901]
        },
        "overall_analysis": {
          "prediction": "fake",
          "confidence": 0.912
        },
        "config": {
          "model_id": "model_6",
          "analysis_focus": "Waveform & Melody Pattern",
          "task": "Real/AI-Cover/Fake 3-Class",
          "num_classes": 3,
          "labels": ["real", "ai_cover", "fake"]
        }
      }
    ]
  }
}
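The Pro response mixes classifiers with different tasks, so one way to read it is to group the overall predictions by `config.task`. The sketch below is an illustration only: `overall_by_task` is a hypothetical helper, and the sample data is a trimmed copy of the Pro example that keeps just the fields the helper reads.

```python
from collections import defaultdict


def overall_by_task(model_results):
    """Group each model's overall prediction under its config["task"] name."""
    grouped = defaultdict(list)
    for model_result in model_results:
        overall = model_result.get("overall_analysis")
        if overall is None:
            # Skip models that do not return an overall prediction.
            continue
        config = model_result["config"]
        grouped[config["task"]].append(
            (config["model_id"], overall["prediction"], overall["confidence"])
        )
    return dict(grouped)


# Trimmed copy of the Pro example: only the fields this sketch reads.
pro_models = [
    {"overall_analysis": {"prediction": "fake", "confidence": 0.876},
     "config": {"model_id": "model_0", "task": "Real/Fake Binary Classification"}},
    {"overall_analysis": {"prediction": "fake", "confidence": 0.823},
     "config": {"model_id": "model_4", "task": "Real/AI-Cover/Fake 3-Class"}},
    {"overall_analysis": {"prediction": "suno_v4", "confidence": 0.634},
     "config": {"model_id": "model_5", "task": "Detailed Fake Source Classification"}},
    {"overall_analysis": {"prediction": "fake", "confidence": 0.912},
     "config": {"model_id": "model_6", "task": "Real/AI-Cover/Fake 3-Class"}},
]

grouped = overall_by_task(pro_models)
print(grouped["Detailed Fake Source Classification"])
```

Grouping by task keeps the source-attribution label (`suno_v4` here) separate from the real/fake verdicts, which use different label sets.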
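Result Fields
| Field | Type | Description |
|---|---|---|
| `task_id` | string | Unique task identifier |
| `status` | string | Task status (the examples on this page show `pending`, `processing`, and `success`) |
| `model_type` | string | Model used for detection |
| `completed_at` | string | ISO 8601 completion timestamp |
| `result` | object | Detection results keyed by filename |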
Result Structure
The `result` field is an object where:
Key: Audio filename
Value: Array of model results
Each array element contains the results from one model in the ensemble.
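The filename-to-array structure described above can be walked with two nested loops. The following is a minimal sketch, assuming the payload has already been decoded from JSON; `summarize` is a hypothetical helper, and the sample keeps only two of the four model results from the Standard example.

```python
def summarize(result):
    """Flatten `result` into (filename, model_id, prediction, confidence) rows."""
    rows = []
    for filename, model_results in result.items():
        for model_result in model_results:
            # Use .get() because not every model returns every analysis object.
            overall = model_result.get("overall_analysis", {})
            config = model_result.get("config", {})
            rows.append((
                filename,
                config.get("model_id"),
                overall.get("prediction"),
                overall.get("confidence"),
            ))
    return rows


# Trimmed copy of the Standard example: two of the four model results.
sample_result = {
    "audio_filename.mp3": [
        {"overall_analysis": {"prediction": "real", "confidence": 0.923},
         "config": {"model_id": "model_0"}},
        {"overall_analysis": {"prediction": "real", "confidence": 0.945},
         "config": {"model_id": "model_1"}},
    ]
}

for row in summarize(sample_result):
    print(row)
```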
Model Result Fields
| Field | Type | Description |
|---|---|---|
| `segment_analysis` | object | Results from analyzing individual segments. Optional: not every model returns both analysis objects (in the examples on this page, `model_0` returns only `overall_analysis`) |
| `overall_analysis` | object | Final prediction for the entire audio |
| `config` | object | Model configuration and metadata |
Config Fields
| Field | Type | Description |
|---|---|---|
| `model_id` | string | Unique model identifier (e.g., `model_0`) |
| `analysis_focus` | string | What the model analyzes (see Analysis Focus below) |
| `task` | string | Classification task type |
| `num_classes` | integer | Number of output classes |
| `labels` | array | Possible prediction labels |
Analysis Focus
| Focus | Description |
|---|---|
| Waveform & Melody Pattern | Analyzes audio waveform characteristics and melodic patterns |
| Mixing & Audio Effects | Analyzes mixing techniques and audio effect signatures |
Analysis Result Fields
| Field | Type | Description |
|---|---|---|
| `segment_analysis` | object | Results from analyzing individual segments of the audio |
| `segment_analysis.prediction` | array | Prediction for each audio segment |
| `segment_analysis.confidence` | array | Confidence score for each segment |
| `overall_analysis` | object | Final prediction considering the entire song structure |
| `overall_analysis.prediction` | string | Final aggregated prediction |
| `overall_analysis.confidence` | float | Final confidence score (0.0 - 1.0) |
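The per-segment arrays are easiest to use when paired up. The sketch below assumes the two arrays are parallel (index i in each describes the same segment), which matches the examples on this page but is not stated explicitly; `segment_pairs` is a hypothetical helper, and the sample is taken from `model_4` in the Pro example.

```python
def segment_pairs(model_result):
    """Return [(label, confidence), ...] per segment, or [] if absent."""
    segments = model_result.get("segment_analysis")
    if not segments:
        # segment_analysis is optional, so a missing key is not an error.
        return []
    # Assumption: the arrays are parallel, so zip() lines up segment i.
    return list(zip(segments["prediction"], segments["confidence"]))


# segment_analysis copied from model_4 in the Pro example.
model_4 = {
    "segment_analysis": {
        "prediction": ["fake", "fake", "ai_cover", "fake"],
        "confidence": [0.712, 0.834, 0.623, 0.789],
    }
}

pairs = segment_pairs(model_4)
least_certain = min(pairs, key=lambda pair: pair[1])
print(least_certain)  # ('ai_cover', 0.623)
```

Looking at the least confident segment is one way to spot where a model's per-segment view disagrees with its overall prediction.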
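Classification Labels
| Task | Labels |
|---|---|
| Real/Fake Binary | `real`, `fake` |
| 3-Class | `real`, `ai_cover`, `fake` |
| Detailed Fake Source | `real`, `suno_v4`, `suno_v4_5`, `suno_v4_5_plus`, `suno_v5`, `other` |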
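Because each model result carries its own `labels` list in `config`, a client can sanity-check predictions against it instead of hard-coding label sets. This is a sketch under stated assumptions: `label_problems` is a hypothetical helper for client-side validation, not something the service requires, and the sample is `model_5` from the Pro example.

```python
def label_problems(model_result):
    """Return a list of human-readable inconsistencies (empty if none)."""
    config = model_result["config"]
    labels = config["labels"]
    problems = []
    if config["num_classes"] != len(labels):
        problems.append("num_classes does not match the number of labels")
    # Collect the overall prediction plus any per-segment predictions.
    predictions = []
    overall = model_result.get("overall_analysis")
    if overall is not None:
        predictions.append(overall["prediction"])
    predictions.extend(model_result.get("segment_analysis", {}).get("prediction", []))
    for prediction in predictions:
        if prediction not in labels:
            problems.append(f"unexpected label: {prediction}")
    return problems


# model_5 from the Pro example, reduced to the fields checked here.
model_5 = {
    "segment_analysis": {"prediction": ["suno_v4", "suno_v4", "suno_v4_5", "suno_v4"]},
    "overall_analysis": {"prediction": "suno_v4", "confidence": 0.634},
    "config": {
        "num_classes": 6,
        "labels": ["real", "suno_v4", "suno_v4_5", "suno_v4_5_plus", "suno_v5", "other"],
    },
}

print(label_problems(model_5))  # []
```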
Notes
Supported formats: mp3, wav, flac, m4a, aac, ogg
Segment Analysis: Analyzes each segment of the audio independently to detect localized AI artifacts
Overall Analysis: Considers the entire song structure and aggregates segment results for final prediction
Result Array: Each element in the result array represents one model’s analysis, allowing clients to iterate through all model results easily