RFC: strict account checking

Discussion:

Simon Michael

2018-10-08 19:20:06 UTC

Cc'd from https://github.com/simonmichael/hledger/issues/891. Do you see any problems or simpler alternatives?

This is a proposal for the "optional strict checking" part of #217 (https://github.com/simonmichael/hledger/issues/217), allowing us to check account names in transactions against account declarations, and report misspelled or non-permitted account names, similar to --strict/--pedantic mode in Ledger. Here's what I'm thinking:

Proposal 1:
There are different kinds of accounts in the chart of accounts:

- structural parent accounts, like assets and liabilities, needed for the tree, not normally posted to
- accounts corresponding to real-world accounts, like assets:bankA:checking and liabilities:visa
- imaginary subaccounts of real-world accounts, like assets:bankA:checking:rent or assets:bankA:savings:travel, used eg for budget envelopes/savings goals
- accounts used to balance the books and track accounting concepts, like equity:opening balances
- accounts used to categorise inflows and outflows, like revenues:salary and expenses:food:groceries:bulk:rice

The first kind (structural parents) usually should not receive postings. All the others typically can receive postings. I'm calling this an account's "usability", ie whether it may be used in postings (I haven't found a better word, ideas welcome.)

You may want to limit the creation of subaccounts deeper than what has been declared. You may want to lock down the basic chart of accounts but still allow freeform subaccounts in certain places.

Syntax should be lightweight and language neutral. Declaring account usability should be orthogonal to the other functions of account directives (display order, type etc.)

We could declare an account's usability by adding characters in the metadata area (in the account directive after two or more spaces, similar to #877, https://github.com/simonmichael/hledger/pull/877):

- . (a period) means "this account is usable"
- * (a star) means "any subaccount of this account is usable (both children and deeper descendants)"

If any such declarations are found in the journal, account usability checking is enabled and hledger will raise an error if it finds postings to accounts which have not been declared usable.

Eg:

; postings to assets or assets:bank are not allowed
account assets
account assets:bank
; postings to assets:bank:checking or any subaccount are allowed
account assets:bank:checking  .*
; postings to assets:bank:savings are allowed, but not to subaccounts
account assets:bank:savings  .
; postings to anything under liabilities, revenues or expenses are allowed (but not to the top-level accounts)
account liabilities  *
account revenues  *
account expenses  *
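
To make the checking rule concrete, here is a minimal Python sketch of Proposal 1. It is not hledger code: the function names (parse_usability, checking_enabled, is_postable) are hypothetical, and it assumes the flags are separated from the account name by two or more spaces, as described above.

```python
import re

def parse_usability(journal_lines):
    """Map each declared account name to its set of usability flags.

    Assumption: flags ('.' and/or '*') follow the account name after two
    or more spaces; anything after ';' is treated as a comment.
    """
    flags = {}
    for line in journal_lines:
        line = line.split(";", 1)[0].rstrip()
        if not line.startswith("account "):
            continue
        fields = re.split(r" {2,}", line[len("account "):].strip(), maxsplit=1)
        marks = fields[1] if len(fields) > 1 else ""
        flags[fields[0]] = {c for c in marks if c in ".*"}
    return flags

def checking_enabled(flags):
    """Strict checking switches on as soon as any declaration carries a flag."""
    return any(flags.values())

def is_postable(account, flags):
    """True if `account` is declared usable, or lies under a starred ancestor."""
    if "." in flags.get(account, set()):
        return True
    parts = account.split(":")
    return any("*" in flags.get(":".join(parts[:i]), set())
               for i in range(1, len(parts)))

journal = [
    "; a cut-down version of the example above",
    "account assets",
    "account assets:bank",
    "account assets:bank:checking  .*",
    "account assets:bank:savings  .",
    "account expenses  *",
]
flags = parse_usability(journal)
```

With these declarations, is_postable("assets:bank:checking:rent", flags) holds because of the star on assets:bank:checking, while is_postable("assets:bank:savings:travel", flags) and is_postable("expenses", flags) do not.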

Proposal 2:

Alternately it could be

- . "this account is usable"
- * "direct children of this account are usable"
- ** "any subaccount of this account is usable (both children and deeper)"

This gives more control. And if we want this in future, it would be better to start with it now (switching would be disruptive.) However I think it is unnecessary precision, and it's better to keep things simple.
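
For comparison, a small Python sketch of the Proposal 2 rule, where a single star reaches only direct children. The function name and the marks dictionary (declared account name mapped to its mark string) are illustrative assumptions, not part of hledger.

```python
def is_postable(account, marks):
    """Proposal 2 rule; `marks` maps a declared account to its mark string."""
    if "." in marks.get(account, ""):
        return True
    parts = account.split(":")
    for i in range(1, len(parts)):
        mark = marks.get(":".join(parts[:i]), "")
        is_direct_parent = (i == len(parts) - 1)
        # '**' opens up every descendant, a lone '*' only direct children.
        if "**" in mark or ("*" in mark and is_direct_parent):
            return True
    return False

# Hypothetical declarations, written as account -> marks.
marks = {"expenses": "*", "liabilities": "**", "assets:bank:checking": ".**"}
```

Here expenses:food is postable but expenses:food:rice is not, which is the extra precision Proposal 2 buys over Proposal 1.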

--
You received this message because you are subscribed to the Google Groups "hledger" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hledger+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Carel Fellinger

2018-10-10 08:17:47 UTC

Hai Simon,

Post by Simon Michael
structural parent accounts, like assets and liabilities, needed for
the tree, not normally posted to

...

Post by Simon Michael
The first kind (structural parents) usually should not receive
postings. All the others typically can receive postings. I'm calling
this an account's "usability", ie whether it may be used in postings
(I haven't found a better word, ideas welcome.)

GnuCash uses the term placeholder to refer to accounts that aren't supposed
to contain postings themselves. Non-placeholders can receive postings.
Such placeholders have their use (e.g. as aggregates) so calling them
non-usable or without-usability is a bit harsh :)

Post by Simon Michael
We could declare an account's usability by adding characters in the
metadata area (in the account directive after two or more spaces,
. (a period) means "this account is usable"
* (a star) means "any subaccount of this account is usable (both
children and deeper descendants)"

Hm all that noise, why not let it be part of the account name declaration:

Proposal 3:

acc:ount:
acc:ount:name
acc:ount:name:*
acc:ount:name:more:**
acc:ount:name:less:**:one

acc and acc:ount are placeholders; the others can receive postings.
acc:ount:name can have one level of subaccounts, acc:ount:name:more
can have subaccounts at any depth, and every subaccount of
acc:ount:name:less must end with one.
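
One way to read these declarations is as anchored patterns over colon-separated name segments. The Python sketch below is only an illustration: the function names are made up, and it assumes that * matches exactly one segment, that ** matches one or more segments, and that a declaration ending in a colon is structural and never makes anything postable.

```python
import re

def pattern_to_regex(pattern):
    """Translate a wildcard account declaration into a compiled regex."""
    pieces = []
    for segment in pattern.split(":"):
        if segment == "**":
            pieces.append(r"[^:]+(?::[^:]+)*")  # one or more segments
        elif segment == "*":
            pieces.append(r"[^:]+")             # exactly one segment
        else:
            pieces.append(re.escape(segment))   # literal account name part
    return re.compile(":".join(pieces))

def is_postable(account, declarations):
    """True if any non-structural declaration matches the whole account name."""
    return any(pattern_to_regex(d).fullmatch(account)
               for d in declarations
               if not d.endswith(":"))

declarations = [
    "acc:ount:",
    "acc:ount:name",
    "acc:ount:name:*",
    "acc:ount:name:more:**",
    "acc:ount:name:less:**:one",
]
```

Under these assumptions acc:ount:name:more:a:b is postable, acc:ount:name:less:a:b:one is postable, and acc:ount:name:less:a:b is rejected because it does not end with one.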

--
groetjes, carel

Zoran Zaric

2018-10-10 09:08:29 UTC

Post by Carel Fellinger
Hai Simon,

Hi Carel,

Post by Carel Fellinger
acc:ount:name
acc:ount:name:*
acc:ount:name:more:**
acc:ount:name:less:**:one
acc and acc:ount are placeholders, the others can receive postings.
acc:ount:name can have one level of subaccounts, but acc:ount:name:more:**
can have any level of sub accounts, but acc:ount:name:less sub accounts
all must end with one.

I really like this!

Simon Michael

2018-10-10 18:21:39 UTC

Thanks Carel.

Updating my terminology:

- postable accounts: accounts intended to receive postings
- structural/placeholder accounts: parent accounts required to create the tree (forest) structure
- real accounts: correspond to some kind of real-world account like a checking account or wallet

I realised that annotations after whitespace would need to be aligned for readability which would be a pain. Yours is elegant and the best yet. Added to https://github.com/simonmichael/hledger/issues/891.

Now I notice that you have omitted the "account" keyword, was that intentional ? I expect we'll want to keep that for compatibility and clarity. (We might add an alternate form some time: "accounts" followed by indented account names.)

In this and the other proposals I'll assume account directives have the standard directive scope: until end of current file.

When would strict checking be activated in your proposal ? I'll assume: when any account directives exist.

Upgraders who already have some account directives will have to take action to keep their journals working, adding :* or :** to them and/or adding more account directives. At minimum, all top level accounts will need to be declared.

Can we think of worthwhile use cases for infix wildcards (less:**:one) ? Would you allow more complex patterns like **:foo:*:bar:**:baz:* ?

Account directives have a lot to do now: providing documentation, providing autocompletions, setting account types, setting display order, setting postability (cf http://hledger.org/manual.html#declaring-accounts). But the interactions seem ok so far.

How would wildcards interact with display order ? "If there's any wildcard, the directive has no effect on display order" ?

I had another proposal ready which I'll share next.

Simon Michael

2018-10-10 19:02:17 UTC

From https://github.com/simonmichael/hledger/issues/891.

Proposal 4

Maybe we don't want such detailed control, and the maintenance busywork that comes with it ? Maybe we can support diverse declaration styles a little more conveniently ? Maybe we want to offload some functionality from account directives ?

We could activate strict account checking with a new postable-accounts (or similar) directive. This specifies in a more global and perhaps more future-extensible way which area of the accounts tree can receive postings. Accounts tree here is the full tree (forest) implied by the account directives in effect, including both declared and undeclared accounts.

Here are some possible values, least restrictive first:

postable-accounts: any - unrestricted accounts, equivalent to no directive.

postable-accounts: tree-and-subs - can post to any account on the tree, plus anywhere under (above?) the tree, ie deeper subaccounts.

postable-accounts: tree - can post to any account on the tree, but no deeper.

postable-accounts: leaves-and-subs - can post to any leaf account of the tree or any deeper subaccount.

postable-accounts: leaves - can post to leaf accounts only.

postable-accounts: declared-and-subs - can post only to accounts declared with account directives, or deeper subaccounts. Possibly excluding the ones ending with : as in proposal 3.

postable-accounts: declared - as above, without the subs.

postable-accounts: declared-wildcards - could even have this, implementing proposal 3.
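
The modes above can be compared side by side with a short Python sketch. It is a reading of this proposal, not an implementation: all function names are hypothetical, "tree" is taken to mean the declared accounts plus the parents they imply, declared-wildcards is left out, and so is the option of excluding names that end with a colon.

```python
def ancestors(account):
    """All proper ancestors of an account name, top-level first."""
    parts = account.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts))]

def build_tree(declared):
    """Declared accounts plus the parent accounts they imply."""
    tree = set(declared)
    for name in declared:
        tree.update(ancestors(name))
    return tree

def leaves(tree):
    """Accounts on the tree that have no children on the tree."""
    parents = {p for name in tree for p in ancestors(name)}
    return tree - parents

def is_postable(account, declared, mode):
    """Apply one postable-accounts mode to a single account name."""
    if mode == "any":
        return True
    tree = build_tree(declared)
    areas = {"tree": tree, "leaves": leaves(tree), "declared": set(declared)}
    area_name, _, suffix = mode.partition("-and-")
    if area_name not in areas or suffix not in ("", "subs"):
        raise ValueError(f"unsupported mode: {mode}")
    allowed = areas[area_name]
    if account in allowed:
        return True
    # The "-and-subs" variants also accept anything below an allowed account.
    return suffix == "subs" and any(p in allowed for p in ancestors(account))

# Hypothetical account directives in effect.
declared = ["assets:bank:checking", "expenses"]
```

With these two declarations, assets:cash is accepted by tree-and-subs (its parent assets is on the tree) but rejected by leaves-and-subs and declared-and-subs, and assets:bank is accepted by tree but not by leaves or declared.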

Carel Fellinger

2018-10-12 14:52:22 UTC

Post by Simon Michael
From https://github.com/simonmichael/hledger/issues/891.
Proposal 4

maybe it's just me but I've a hard time getting my head around this
proposal. Most likely I'm overlooking something entirely, sorry.

So let me instead first refine my own:

proposal 3b (b as in better:)

- account decls start with the account keyword
- there are two types of wildcards: star (*) and question mark (?)
- star (*) stands for any name without colons (:)
- question mark (?) is like star, but only for known account names
- doubling the wildcard (** or ??) removes the no-colon restriction
- tripling the wildcard will expand to the longest possible name only
- names without trailing colon are postable
- names with trailing colon are structural
- hence parent accounts are structural unless declared postable elsewhere
- known accounts are names declared implicitly (*/**) or explicitly
- BONUS: double double colons (::) indicate natural folding points

account :** > any account is postable
account :*** > any leaf account is postable, but not its parents
account :?? > all known accounts are postable
account :??? > all known leaf accounts are postable, but not their parents

MIND YOU: ** and ?? expand to names not empty spaces!
MIND YOU: to know what's known all account decls have to be read
MIND YOU: the distinction between leaves and parents can only be made
after all declarations (and transactions) have been parsed

BONUS: the --depth flag provides a very crude way of 'folding' the
account hierarchy. Double double colons (::) could be used to specify
the natural place to fold the account hierarchy. And in hledger-ui's
account screen the enter key could be used to toggle between a folded
and unfolded view. Unpostable accounts would be depicted by their
trailing colon, so an unpostable natural folding point gets three.

Post by Simon Michael
We could activate strict account checking with a new
postable-accounts (or similar) directive. This specifies in a more
global and perhaps more future-extensible way which area of the
accounts tree can receive postings. Accounts tree here is the full
tree (forest) implied by the account directives in effect, including
both declared and undeclared accounts.
postable-accounts: any - unrestricted accounts, equivalent to no directive.

3b: accounts :** (any account, the default lacking any account decl)

Post by Simon Michael
postable-accounts: tree-and-subs - can post to any account on the
tree, plus anywhere under (above?) the tree, ie deeper subaccounts.

3b: accounts :?? (all known accounts in the tree)
accounts :???:** (any subaccount of the tree's leaves)

Post by Simon Michael
postable-accounts: tree - can post to any account on the tree, but no deeper.

3b: accounts :?? (all known accounts in the tree)

Post by Simon Michael
postable-accounts: leaves-and-subs - can post to any leaf account of
the tree or any deeper subaccount.

3b: accounts :??? (all known leaf accounts on the tree)
accounts :???:** (any subaccount of the tree's leaves)

Post by Simon Michael
postable-accounts: leaves - can post to leaf accounts only.

3b: accounts :??? (all known leaf accounts on the tree)

Post by Simon Michael
postable-accounts: declared-and-subs - can post only to accounts
declared with account directives, or deeper subaccounts. Possibly
excluding the ones ending with : as in proposal 3.

how does this differ from tree-and-subs ?

Post by Simon Michael
postable-accounts: declared - as above, without the subs.

how does this differ from tree ?

Post by Simon Michael
postable-accounts: declared-wildcards - could even have this,
implementing proposal 3.

This got me thinking, mmm, the scope of the decls is per file, argh.
I missed that point, and treated them as holding until overridden or
so. mmm, maybe it doesn't matter, it only takes one line per file,
and maybe this flag *should* be per file. After all, only within the
file you know what you are doing with account names.

And in your proposal 4, tree is limited per file or global?
If it's per file then I've an even harder time digging it.

In the end, I prefer a global account validity decl list.
Account expansion rules could be local per file, you can
imagine restricting account names per file, but the final
place for account name restrictions seems to me to be global.

Simon Michael

2018-10-16 15:41:07 UTC

Sorry for the delay.

Post by Carel Fellinger

Post by Simon Michael
From https://github.com/simonmichael/hledger/issues/891.
Proposal 4

maybe it's just me but I've a hard time getting my head around this
proposal. Most likely I'm overlooking something entirely, sorry.
proposal 3b (b as in better:)
- account decls start with the account keyword
- there are two types of wildcards: star (*) and question mark (?)
- star (*) stands for any name without colons (:)
- question mark (?) is like star, but only for known account names
- doubling the wildcard (** or ??) removes the no-colon restriction
- tripling the wildcard will expand to the longest possible name only
- names without trailing colon are postable
- names with trailing colon are structural
- hence parent accounts are structural unless declared postable elsewhere
- known accounts are names declared implicitly (*/**) or explicitly
- BONUS: double double colons (::) indicate natural folding points
account :** > any account is postable
account :*** > any leaf account is postable, but not its parents
account :?? > all known accounts are postable
account :??? > all known leaf accounts are postable, but not their parents
MIND YOU: ** and ?? expand to names not empty spaces!
MIND YOU: to know what's known all account decls have to be read
MIND YOU: the distinction between leaves and parents can only be made
after all declarations (and transactions) have been parsed
BONUS: the --depth flag provides a very crude way of 'folding' the
account hierarchy. Double double colons (::) could be used to specify
the natural place to fold the account hierarchy. And in hledger-ui's
account screen the enter key could be used to toggle between a folded
and unfolded view. Unpostable accounts would be depicted by their
trailing colon, so an unpostable natural folding point gets three.

This is interesting, but I think we have crossed over into "cryptic" and "too much rope" territory now. Nobody is asking for this much power yet. Even our simplest proposals will probably change with real-world testing. We should try a minimum viable enhancement or two and get more experience.

That could be a simple form of your 3 (eg: 3c, wildcards at end of account names only), or of my 4 (eg: 4b, supporting just two or three of the more practical modes).

Yours looks more immediately intuitive; mine may require looking up mode meanings in docs. I think yours is also more likely to have some inconvenient interactions with the other functions of account directives. I can't tell yet which one works best. All: PRs exploring either or both of these would be welcome.

Post by Carel Fellinger

how does this differ from tree-and-subs ?

By declared I mean accounts which have an explicit account directive, while tree means accounts anywhere on the full tree implied by declarations. Eg with "account a:b", declared includes [a:b] and tree includes [a, a:b].
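
The same distinction as a tiny Python illustration (implied_tree is a made-up helper name, not an hledger function):

```python
def implied_tree(declared):
    """Every declared account together with all of its parent accounts."""
    tree = set()
    for name in declared:
        parts = name.split(":")
        tree.update(":".join(parts[:i]) for i in range(1, len(parts) + 1))
    return tree

declared = {"a:b"}
tree = implied_tree(declared)
print(sorted(declared))  # ['a:b']
print(sorted(tree))      # ['a', 'a:b']
```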

Post by Carel Fellinger

Post by Simon Michael
postable-accounts: declared-wildcards - could even have this,
implementing proposal 3.

This got me thinking, mmm, the scope of the decls is per file, argh.
I missed that point, and treated them as holding until overridden or
so. mmm, maybe it doesn't matter, it only takes one line per file,
and maybe this flag *should* be per file. After all, only within the
file you know what you are doing with account names.
And in your proposal 4, tree is limited per file or global?
If it's per file then I've an even harder time digging it.
In the end, I prefer a global account validity decl list.
Account expansion rules could be local per file, you can
imagine restricting account names per file, but the final
place for account name restrictions seems to me to be global.

Yes, I think it, like most directives, should be "per file", ie lasting until the end of the current file as discussed in the Directives doc. Even though this is inconvenient sometimes.

Currently though, at least some of the account directive's features are global over all files. Eg setting display order (I just tested).