Identifying CUI via Regex and Sensitive Information Types

old.reddit.com / @/u/enigmaunbound, https://old.reddit.com/user/enigmaunbound

I find it frustrating that MS has not written a CUI sensitive information type. I'm working on my own to help make AIP and DLP in M365 earn their pay. I have a start on this but would love any critique or suggestions.

Here is my initial swing at a RegEx. This works pretty decently for me. It grabs the CUI// type banners. My intent is to find the term CUI followed by // and any string of word characters up to the next whitespace.

^CUI\/\/\w*$
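
A quick way to sanity-check the pattern above is to run it over a few sample lines. This is only a sketch using Python's `re` module, which may not behave identically to the regex engine M365 uses. The sample banner lines are made up for illustration, and `BANNER_LOOSE` is a hypothetical variant, not part of the original post. It shows that `\w*` stops at a hyphen or a second `//`, so multi-part banners are missed.

```python
import re

# Pattern from the post. MULTILINE makes ^ and $ anchor to each line.
BANNER = re.compile(r"^CUI\/\/\w*$", re.MULTILINE)

# Hypothetical looser variant (an assumption, not from the post): also
# accepts hyphens and further slash-separated segments after "CUI//".
BANNER_LOOSE = re.compile(r"^CUI//[\w/-]+$", re.MULTILINE)

# Made-up sample lines, for illustration only.
samples = [
    "CUI//PRVCY",
    "CUI//SP-CTI",
    "CUI//SP-CTI//NOFORN",
    "CUI",
    "See CUI//PRVCY above",
]

def matches(pattern, line):
    """Return True if the compiled pattern matches anywhere in the line."""
    return pattern.search(line) is not None

strict_hits = [s for s in samples if matches(BANNER, s)]
loose_hits = [s for s in samples if matches(BANNER_LOOSE, s)]

print(strict_hits)  # only the single-segment banner
print(loose_hits)   # the three lines that start with CUI//
```

If multi-segment banners matter in your environment, something like the looser variant may be worth testing against real documents before trusting it.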

The docs also allow for the word "CUI" or "CONTROLLED" on its own, so I have two similar patterns:

^CONTROLLED

^CUI

These are lower confidence as they are fairly generic. I don't see a way to tighten them up, so I would likely set their confidence to low.
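
To illustrate how generic these two anchors are, here is a small sketch in Python's `re` (again, not necessarily the same engine M365 uses). The sample lines are invented. `TIGHTER` adds a word boundary, which is an assumption on my part rather than anything from the post. It only removes prefix collisions such as "CUISINE", so the low confidence rating still seems right.

```python
import re

# The two generic patterns from the post.
GENERIC = [
    re.compile(r"^CONTROLLED", re.MULTILINE),
    re.compile(r"^CUI", re.MULTILINE),
]

# Hypothetical tightening (an assumption, not from the post): require a
# word boundary so "CUI" is not matched as the prefix of a longer word.
TIGHTER = re.compile(r"^(?:CUI|CONTROLLED)\b", re.MULTILINE)

# Made-up sample lines, for illustration only.
lines = [
    "CUI",
    "CONTROLLED",
    "CONTROLLED substances inventory",
    "CUISINE menu for Friday",
    "Marked as CUI by the owner",
]

generic_hits = [ln for ln in lines if any(p.search(ln) for p in GENERIC)]
tighter_hits = [ln for ln in lines if TIGHTER.search(ln)]

print(generic_hits)  # includes the two unrelated lines
print(tighter_hits)  # drops "CUISINE..." but keeps "CONTROLLED substances..."
```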

I did add some associated keywords to the medium confidence identifier. I hope this helps prevent false positives, but it assumes people abide by the marking guidelines. My experience so far has been that you are lucky if there is a banner. You won the lottery if the marking was valid and intentional by a legit data owner.

Strings

CUI

Controlled by

DISSEM
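
The idea of pairing a pattern with supporting keywords can be sketched outside of M365 like this. It is a rough local approximation in Python, not how the DLP engine actually evaluates a sensitive information type. The `classify` function, the `WINDOW` size, the case-insensitive keyword check, and the sample text are all assumptions made for illustration. A pattern hit with a keyword nearby is rated "medium", and a pattern hit alone is rated "low".

```python
import re

# Patterns and keyword strings taken from the post.
PATTERNS = [r"^CUI\/\/\w*$", r"^CONTROLLED", r"^CUI"]
KEYWORDS = ["CUI", "Controlled by", "DISSEM"]

# Assumed proximity window in characters (hypothetical value).
WINDOW = 300

def classify(text):
    """Return 'medium', 'low', or None for a block of text."""
    found = False
    for pat in PATTERNS:
        for m in re.finditer(pat, text, re.MULTILINE):
            found = True
            # Look around the match but not inside it, so the matched
            # "CUI" itself does not count as its own supporting keyword.
            before = text[max(0, m.start() - WINDOW):m.start()]
            after = text[m.end():m.end() + WINDOW]
            nearby = (before + "\n" + after).lower()
            if any(k.lower() in nearby for k in KEYWORDS):
                return "medium"
    return "low" if found else None

# Made-up sample documents, for illustration only.
marked = "CUI//PRVCY\nControlled by: Example Office\nDISSEM: FEDCON"
bare = "CUI\nLunch menu for Friday"
plain = "Quarterly report"

print(classify(marked), classify(bare), classify(plain))
```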


published about 1 year ago



