Combo RAM and Kconfig reduction #2849

Draft: wants to merge 9 commits into base: main

petejohanson
Contributor

Keeping this as a draft for now. This is a collection of improvements that removes a few combo Kconfig flags, whose values are now calculated automatically at build time, and reduces RAM usage on more constrained devices.

A Le Chiffre STM32 build on main:

[198/198] Linking C executable zephyr/zmk.elf
Memory region         Used Size  Region Size  %age Used
           FLASH:       42636 B       128 KB     32.53%
             RAM:        8120 B        15 KB     52.86%
     RetainedMem:          0 GB          1 B      0.00%
        IDT_LIST:          0 GB         2 KB      0.00%

Same keymap with this PR:

❯ west build -d build/lc_stm32_tap_dance
[7/7] Linking C executable zephyr/zmk.elf
Memory region         Used Size  Region Size  %age Used
           FLASH:       42732 B       128 KB     32.60%
             RAM:        7736 B        15 KB     50.36%
     RetainedMem:          0 GB          1 B      0.00%
        IDT_LIST:          0 GB         2 KB      0.00%

For a savings of 384 bytes of RAM (8120 B down to 7736 B), at a cost of 96 bytes of flash (42636 B up to 42732 B).

Needs some more cleanup, and there's room for some performance improvements for the access to the new bit fields, but this should be ready for early testing.

PR check-list

  • Branch has a clean commit history
  • Additional tests are included, if changing behaviors/core code that is testable.
  • Proper Copyright + License headers added to applicable files (Generally, we stick to "The ZMK Contributors" for copyrights to help avoid churn when files get edited)
  • Pre-commit used to check formatting of files, commit messages, etc.
  • Includes any necessary documentation changes.

Determine the max keys per combo automatically from the devicetree,
so we remove the ZMK_COMBO_MAX_KEYS_PER_COMBO Kconfig symbol.

Reference combos by index, not 32-bit pointers, and store bit fields
instead of arrays in several places, to bring down our flash/RAM usage.

Use a bit field to track candidate combos, to avoid needing an explicit
`ZMK_COMBO_MAX_COMBOS_PER_KEY` setting.
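To make the idea in the commit messages above concrete, here is a minimal sketch of tracking candidate combos as one bit per combo index rather than as a fixed-size array of pointers. This is not the PR's code: `NUM_COMBOS`, `struct candidate_mask`, and the `mask_*` helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical combo count; in the PR this would come from the devicetree. */
#define NUM_COMBOS 40

/* Number of 32-bit words needed to hold one bit per combo. */
#define MASK_WORDS ((NUM_COMBOS + 31) / 32)

/* One bit per combo index: bit set means "still a candidate". */
struct candidate_mask {
    uint32_t words[MASK_WORDS];
};

static void mask_set(struct candidate_mask *m, uint16_t combo_idx) {
    m->words[combo_idx / 32] |= (UINT32_C(1) << (combo_idx % 32));
}

static void mask_clear(struct candidate_mask *m, uint16_t combo_idx) {
    m->words[combo_idx / 32] &= ~(UINT32_C(1) << (combo_idx % 32));
}

static bool mask_test(const struct candidate_mask *m, uint16_t combo_idx) {
    return (m->words[combo_idx / 32] >> (combo_idx % 32)) & 1U;
}
```

In this sketch the mask's size depends only on the total number of combos (8 bytes for 40 combos), so no separate per-key limit has to be configured.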
    int
    default 0
    help
      Deprecated: Storage for combos is now determined automatically
Contributor

I don't think these should be deprecated. Rather I think we should take the max of the automatic determination and whatever this value is set to, for future Studio use.
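A minimal sketch of the "take the max" suggestion, under stated assumptions: `AUTO_MAX_KEYS_PER_COMBO` stands in for the devicetree-derived value, `CONFIG_MAX_KEYS_PER_COMBO` for a user-set Kconfig floor, and `struct combo_slot` for whatever storage would be sized from the result. None of these names come from the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: the devicetree-derived value and a user-set floor. */
#define AUTO_MAX_KEYS_PER_COMBO 3
#define CONFIG_MAX_KEYS_PER_COMBO 5

/* Compile-time max, so the result can be used as an array size. */
#define STATIC_MAX(a, b) ((a) > (b) ? (a) : (b))
#define MAX_KEYS_PER_COMBO STATIC_MAX(AUTO_MAX_KEYS_PER_COMBO, CONFIG_MAX_KEYS_PER_COMBO)

/* Hypothetical storage sized by the larger of the two values. */
struct combo_slot {
    uint16_t key_positions[MAX_KEYS_PER_COMBO];
    uint8_t key_position_len;
};
```

If the Kconfig value were left at a default of 0, the devicetree-derived value would always win, so builds that never set it would behave the same as with fully automatic sizing.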

Contributor Author

When we have future Studio use, we'll likely "pre-allocate" things by setting up placeholders in the devicetree, like we do with reserved layers, so our auto-calculation will still be usable there as well.

Contributor

On closer inspection, that seems doable for MAX_COMBOS_PER_KEY, but less so for MAX_KEYS_PER_COMBO: what if the DT has only 2- and 3-key combos, and then someone wants to set up a 5-key one in Studio?

@nmunnich
Contributor

Joel had an interesting comment here that might also apply to this PR.

@petejohanson
Contributor Author

> Joel had an interesting comment here that might also apply to this PR.

Moved to uint32_t type for the bit fields to avoid alignment issues.

app/src/combo.c Outdated
Comment on lines 181 to 189
static uint8_t number_of_set_bits(uint32_t field) {
    uint8_t count = 0;
    while (field) {
        field &= (field - 1);
        count++;
    }
    // clear unmatched candidates
    for (int i = matches; i < CONFIG_ZMK_COMBO_MAX_COMBOS_PER_KEY; i++) {
        candidates[i].combo = NULL;

    return count;
}
Contributor

I don't think we actually need to calculate the total number of set bits. We only care if it's 0, 1, or more.

Something like

    if (value == 0) {
        return 0; 
    }
    if ((value & (value - 1)) == 0) {
        return 1;
    }
    return 2;
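As a sanity check on the suggestion above, the sketch below wraps it in a function and compares it against the full bit count clamped to 2. `zero_one_or_more` and `classifiers_agree` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Full population count using the clear-lowest-set-bit loop. */
static uint8_t number_of_set_bits(uint32_t field) {
    uint8_t count = 0;
    while (field) {
        field &= (field - 1);
        count++;
    }
    return count;
}

/* Returns 0, 1, or 2, where 2 means "two or more bits set". */
static uint8_t zero_one_or_more(uint32_t value) {
    if (value == 0) {
        return 0;
    }
    if ((value & (value - 1)) == 0) {
        return 1;
    }
    return 2;
}

/* Hypothetical helper: true when both agree once the full count is clamped to 2. */
static int classifiers_agree(uint32_t value) {
    uint8_t full = number_of_set_bits(value);
    uint8_t clamped = full > 2 ? 2 : full;
    return clamped == zero_one_or_more(value);
}
```

The classifier does a fixed amount of work regardless of how many bits are set, whereas the full count loops once per set bit.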

Contributor Author

Good call. I tweaked this a bit, roughly modeling what you described, to save on some cycles. Thanks!

@caksoylar
Contributor

Would daily testing be useful in the current state of the PR?

@petejohanson
Contributor Author

> Would daily testing be useful in the current state of the PR?

I believe so, but I need to do so myself first to be 100% sure, even though tests are all green.

static uint8_t zero_one_or_more_bits(uint32_t field) {
    uint8_t count = 0;
    while (field && count < 2) {
        field &= (field - 1);
Contributor

If I am to be super nitpicky, this line gets executed one more time than is necessary with this version of the function. I also feel like this function could have inline added.
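The nitpick above can be illustrated with an instrumented sketch. The quoted hunk is cut off after the bit-clearing line, so the loop's tail (`count++` and `return count`) is an assumption here, and `loop_version`, `branch_version`, and the `clear_ops` counter are hypothetical names added purely to count how often the clearing step runs.

```c
#include <assert.h>
#include <stdint.h>

/* Counts executions of the bit-clearing step, for illustration only. */
static unsigned clear_ops;

/* Loop form; the tail after the clearing step is assumed, not quoted. */
static uint8_t loop_version(uint32_t field) {
    uint8_t count = 0;
    while (field && count < 2) {
        field &= (field - 1);
        clear_ops++;
        count++;
    }
    return count;
}

/* Branch form: at most one clear, then a zero test decides between 1 and 2. */
static inline uint8_t branch_version(uint32_t field) {
    if (field == 0) {
        return 0;
    }
    field &= (field - 1);
    clear_ops++;
    return field ? 2 : 1;
}
```

For an input with two or more bits set, the loop form clears twice while the branch form clears once. Both return the same classification.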

uint32_t pressed_keys_count = 0;
#define COMBO_CHILDREN_COUNT (0 DT_INST_FOREACH_CHILD(0, COMBO_ONE))

// We need at least 4 bytes to avoid alignment issues
Contributor

Suggested change
// We need at least 4 bytes to avoid alignment issues
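For readers unfamiliar with the `COMBO_CHILDREN_COUNT` pattern in the hunk above, here is a sketch of how counting by macro expansion works, without Zephyr. `FOREACH_COMBO_CHILD` is a hypothetical stand-in for `DT_INST_FOREACH_CHILD`, the child names are invented, and the `+1` body of `COMBO_ONE` is an assumption, since the PR's definition is not shown in this conversation.

```c
#include <assert.h>

/* Hypothetical stand-in for DT_INST_FOREACH_CHILD: applies fn to each child node. */
#define FOREACH_COMBO_CHILD(fn) fn(combo_esc) fn(combo_tab) fn(combo_enter)

/* Assumed definition: each child contributes "+1", ignoring the node itself. */
#define COMBO_ONE(node) +1

/* Expands to (0 +1 +1 +1), an integer constant expression. */
#define COMBO_CHILDREN_COUNT (0 FOREACH_COMBO_CHILD(COMBO_ONE))

/* Usable wherever a constant is required, for example as an array size. */
static unsigned char combo_state[COMBO_CHILDREN_COUNT];
```

The leading `0` keeps the expression valid even when there are no children, in which case it expands to just `(0)`.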

Comment on lines +94 to +101
// We do some magic here to generate the `combos` array by "key position length", looping
// by key position length and on each iteration, only include entries where the `key-positions`
// length matches.
// Doing so allows our bitmasks to be "shortest key positions list first" when searching for matches.
// `20` is chosen as a reasonable limit, since the theoretical maximum number of keys you might
// reasonably press simultaneously with 10 fingers is 20 keys, two keys per finger.
static const struct combo_cfg combos[] = {
    LISTIFY(20, COMBO_CONFIGS_WITH_MATCHING_POSITIONS_LEN, (), 0)};
Contributor

I think with the changes to filter_candidates this sorting is no longer necessary.
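For context, the length-bucketed ordering that the quoted code comment describes can be sketched in plain C. This is only an approximation of the technique: the `EMIT_*` macros are a hypothetical substitute for Zephyr's conditional macros, the combo list is invented, and only lengths 2 and 3 are handled.

```c
#include <assert.h>
#include <stddef.h>

struct combo_cfg {
    const char *name;
    int key_position_len;
};

/* EMIT_<len>_<want> keeps its arguments only when the two lengths match. */
#define EMIT_2_2(...) __VA_ARGS__
#define EMIT_3_3(...) __VA_ARGS__
#define EMIT_2_3(...)
#define EMIT_3_2(...)
#define EMIT_IF_LEN(len, want, ...) EMIT_##len##_##want(__VA_ARGS__)

/* Hypothetical combos in declaration order; each pass keeps one length. */
#define COMBOS_WITH_LEN(want)                 \
    EMIT_IF_LEN(3, want, {"three_a", 3},)     \
    EMIT_IF_LEN(2, want, {"two_a", 2},)       \
    EMIT_IF_LEN(3, want, {"three_b", 3},)     \
    EMIT_IF_LEN(2, want, {"two_b", 2},)

/* One pass per length, shortest first, yields a length-sorted array. */
static const struct combo_cfg combos[] = {COMBOS_WITH_LEN(2) COMBOS_WITH_LEN(3)};
```

Each pass walks the whole list, so the array ends up grouped by length at compile time with no runtime sort.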

};

struct active_combo {
    const struct combo_cfg *combo;
    int16_t combo_idx;
Contributor

Why not uint16?
