What Is Experiment History?
Experiment History tracks every time you've trained the AI system. It shows how well the system performed on each run, so you can see whether it's getting better, worse, or staying consistent.
Understanding the Metrics
F1 Score: The overall performance measure, a balance (harmonic mean) of precision and recall. Think of it as a "grade" for how well the AI performs:
- 90-100% = Excellent (A grade)
- 80-89% = Good (B grade)
- 70-79% = Acceptable (C grade)
- Below 70% = Needs improvement
Precision: When the system says "this is Shadow IT," how often is it right? Higher precision means fewer false alarms.
Recall: Of all the actual Shadow IT out there, how much does the system catch? Higher recall means it's finding more threats, though pushing recall higher often costs some precision (more false alarms).
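To see how these three numbers relate, here is a minimal sketch that computes them from raw detection counts. The counts are hypothetical, purely for illustration:

```python
# Hypothetical detection counts from one training run (illustration only).
true_positives = 45   # real Shadow IT the system correctly flagged
false_positives = 5   # benign assets wrongly flagged (false alarms)
false_negatives = 10  # real Shadow IT the system missed

precision = true_positives / (true_positives + false_positives)  # 0.90
recall = true_positives / (true_positives + false_negatives)     # ~0.82
f1 = 2 * precision * recall / (precision + recall)               # ~0.86, a "B grade"

print(f"Precision: {precision:.0%}, Recall: {recall:.0%}, F1: {f1:.0%}")
```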
Reading the Chart
The F1 Score Trend chart shows performance over time. You want this line to:
- Stay flat and high: System is consistently accurate
- Trend upward: System is learning and improving
- Trend downward: Performance is degrading; you may need to adjust training settings or retrain
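If you ever want to look at the trend outside the built-in chart, a short sketch like this one reproduces the same view, assuming you can export per-experiment F1 scores as a plain list (the scores below are made up):

```python
import matplotlib.pyplot as plt

# Hypothetical exported F1 scores, oldest experiment first (illustration only).
f1_scores = [0.78, 0.81, 0.84, 0.83, 0.86]

plt.plot(range(1, len(f1_scores) + 1), f1_scores, marker="o")
plt.axhline(0.80, linestyle="--", label='"Good" threshold (80%)')
plt.xlabel("Experiment #")
plt.ylabel("F1 score")
plt.title("F1 Score Trend")
plt.legend()
plt.show()
```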
What Do the Experiment Details Tell Me?
Timestamp: When the training happened
Mode: How it was trained (Fresh, Incremental, Synthetic Only, or Cached)
Training Details: How much data was used. More samples generally mean better results but take longer to train.
Status: ✓ Complete means the training finished successfully
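Put together, one experiment entry might look something like the record below. The field names are illustrative assumptions, not the product's actual export format:

```python
# Illustrative shape of one experiment record (field names are assumptions,
# not the product's actual schema).
experiment = {
    "timestamp": "2024-06-01T14:32:00Z",  # when the training happened
    "mode": "Incremental",                # Fresh, Incremental, Synthetic Only, or Cached
    "training_samples": 12_000,           # how much data was used
    "status": "Complete",                 # training finished successfully
    "metrics": {"precision": 0.90, "recall": 0.82, "f1": 0.86},
}
```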
What Should I Look For?
Consistent High Scores
If your recent experiments all show F1 scores above 80%, your system is working well. Keep doing what you're doing.
Dropping Performance
If F1 scores are decreasing over time, it might mean your IT environment has changed significantly. Try retraining with "Fresh" mode.
Low Precision, High Recall
System is catching most threats but creating too many false alarms. Lower the contamination rate in the Training Center.
High Precision, Low Recall
System is very accurate but missing some threats. Increase the contamination rate to make detection more sensitive.
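Both fixes adjust the contamination rate: the fraction of assets the detector expects to be anomalous, which is why moving it trades precision against recall. As an illustration only, assuming an anomaly detector along the lines of scikit-learn's IsolationForest (the product's actual model isn't documented here), the effect looks like this:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 4))  # synthetic asset features (illustration only)

# Lower contamination -> fewer assets flagged -> higher precision, lower recall.
# Higher contamination -> more assets flagged -> higher recall, lower precision.
for contamination in (0.01, 0.05, 0.10):
    model = IsolationForest(contamination=contamination, random_state=42).fit(X)
    flagged = (model.predict(X) == -1).sum()  # -1 marks predicted anomalies
    print(f"contamination={contamination:.2f}: {flagged} assets flagged")
```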
Best Practices
- Check after each training run: Always review experiment results after retraining to ensure quality didn't drop
- Compare to previous runs: Look for trends, not just individual scores
- Keep successful configurations: Note what settings produced the best F1 scores
- Document changes: If you change training settings, note what you changed so you can track impact
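One lightweight way to follow the last two practices is to append each run's settings and scores to a log file. The file name and fields below are just a suggested convention, not a product feature:

```python
import json
from datetime import datetime, timezone

# Append each run's settings and scores to a simple log file
# (file name and fields are a suggested convention, not a product feature).
run = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "settings": {"mode": "Fresh", "assets_scale": 2.0, "contamination": 0.05},
    "results": {"precision": 0.90, "recall": 0.82, "f1": 0.86},
}
with open("experiment_log.jsonl", "a") as f:
    f.write(json.dumps(run) + "\n")
```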
Troubleshooting
"No experiments yet": Run training from the Training Center first—click "Retrain"
All scores are low: The system may need more training data. Try increasing Assets Scale to 2.0x or 3.0x in the Training Center.
Scores vary wildly: This suggests inconsistent training data. Use "Fresh" mode and consistent settings for more stable results.