Channel: Questions in topic: "splunk-enterprise"

Best practice for representing bit flag fields in input data?

Suppose I have a field that consists of a byte value, where each bit can represent a "flag": a property whose value is either true or false. In the definition of the record layout, the "parent" field (the byte) has a name, and so does each of the "child" bit flags. For example, suppose I have a field named `toppings` that occupies one byte, where each bit represents whether or not a particular topping was added to a pizza:

1. `anchovies`
2. `bacon`
3. `chilli`
4. `mushrooms`
5. `olives`
6. `pepperoni`

(These names are fictional, but the structure matches actual fields in my data.) Two of the bits are currently unused.

Now suppose I have the freedom to format that data in any way I choose before I get it into Splunk. Some considerations:

- Should I bother including the original byte value, as a number? I'm tending towards "no", but suppose (I know, there's a lot of *supposing* going on here) we have zillions of these records, and for the foreseeable future we're only interested in whether the toppings included `bacon` or `mushrooms`, but there's a slim chance we might at some point also be interested in the others... so maybe we only break out `bacon` and `mushrooms` as separate properties for now. This runs the risk of forgetting what the other bits mean, or that their meaning has changed over time... but it's cheaper to index fewer fields, unless you have an "all you can ingest" license.
- Should the data be "sparse" or "dense"? Let's say we've decided that we're interested in all of the toppings, and that the absence of a flag means "false". One problem: record formats can change over time; new flags can appear in data, and existing flags can become obsolete. If we introduce new toppings (say, `onion` and `capers`) and we've assumed that the absence of a flag means "false", then, when we analyze our data, if we don't keep in mind when onions and capers became available, we might mistakenly think that pizza eaters before a certain date eschewed those toppings.
  We've lost the distinction between "false" and absent (or `null`). (More realistically, for my use case: we might mistakenly think that a particular software property was "false", when in fact that property did not even exist in the version of the software that created the log record.)
- If I use a data format such as JSON that supports nested structures, should I nest the bit flags under their parent, or should I keep a flat structure?

Some examples:

#### Example 1: dense JSON, nested

All available toppings represented.

```
"toppings": {
    "anchovies": true,
    "bacon": true,
    "chilli": true,
    "mushrooms": true,
    "olives": false,
    "pepperoni": false
}
```

#### Example 2: dense JSON, flat

```
"toppings_anchovies": true,
"toppings_bacon": true,
"toppings_chilli": true,
"toppings_mushrooms": true,
"toppings_olives": false,
"toppings_pepperoni": false
```

#### Example 3: sparse JSON, nested

No overall byte value; only "true" properties present (others assumed "false"; literally, missing):

```
"toppings": {
    "bacon": true,
    "mushrooms": true
}
```

#### Example 4: sparse JSON, flat

There might have been other toppings.

```
"toppings_bacon": true,
"toppings_mushrooms": true
```

#### Example 5: sparse JSON, flat, with original byte value

There were other toppings than bacon and mushrooms, but you'd have to know how to interpret the byte value 240.

```
"toppings": 240,
"toppings_bacon": true,
"toppings_mushrooms": true
```

## Summary

I think the "sparse" options (especially where missing means false) are asking for trouble, but I thought I'd at least mention these options, because indexing data costs money. So I think it's down to the "dense" options. In which case, I don't see the point in indexing the original numeric byte value. But nested or flat? Nested means less data ingested (less repetition of the `toppings` qualifier), and I don't see any problems referring to nested properties such as `toppings.anchovies`.
But if I choose nested, then I think that rules out offering users the freedom to ingest from either CSV or JSON while using the same search strings in Splunk regardless of the input data format, because the data ingested from CSV won't have the nested structure `toppings.anchovies`.

Thoughts and advice welcome.
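For what it's worth, expanding the byte into dense named flags is cheap to do in a pre-ingestion step. Here's a minimal sketch in Python using the fictional topping names from above; the assignment of list item 1 to the lowest bit is an assumption, since the question doesn't specify bit order:

```python
import json

# Bit position -> flag name, lowest bit first (assumed ordering;
# the two highest bits are currently unused).
TOPPING_BITS = [
    "anchovies", "bacon", "chilli", "mushrooms", "olives", "pepperoni",
]

def decode_toppings(byte_value: int) -> dict:
    """Expand a bit-flag byte into explicit true/false properties,
    producing the dense, nested shape of Example 1."""
    return {name: bool(byte_value & (1 << bit))
            for bit, name in enumerate(TOPPING_BITS)}

# A record with only bacon (bit 1) and mushrooms (bit 3) set:
record = {"toppings": decode_toppings(0b001010)}
print(json.dumps(record))
```

Because the mapping lives in one place, a new topping (`onion`, say) is a one-line change, and records written before that change simply never carry the new key — which is exactly the "absent vs. false" distinction discussed above.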
