Opened 3 years ago

Closed 3 years ago

#27292 closed enhancement (fixed)

aligned_malloc and realloc for MemoryAllocator

Reported by: gh-kliem Owned by:
Priority: major Milestone: sage-8.7
Component: cython Keywords:
Cc: Merged in:
Authors: Jonathan Kliem, Jeroen Demeyer Reviewers: Jeroen Demeyer, Jonathan Kliem, Nils Bruin
Report Upstream: N/A Work issues:
Branch: 61ad091 (Commits, GitHub, GitLab) Commit: 61ad091c14598d8ef439863c8c22585797707d25
Dependencies: Stopgaps:

Status badges

Description

Adding realloc and aligned_alloc to MemoryAllocator.

realloc is available in cysignals.memory anyway, so we might as well have it in MemoryAllocator.

aligned_malloc is useful for intrinsics, when memory up to 64-byte aligned is required. There are also similar functions in C, but its nice to have the memory taken care of by MemoryAllocator.

Change History (24)

comment:1 Changed 3 years ago by gh-kliem

  • Branch set to public/27292
  • Commit set to 99549dabac23fcade157e783a88643462669c589

New commits:

99549daadded realloc and reallocarray

comment:2 Changed 3 years ago by git

  • Commit changed from 99549dabac23fcade157e783a88643462669c589 to 470b78236ce1c03988d6bf7a333ed84bacb6c2ff

Branch pushed to git repo; I updated commit sha1. New commits:

470b782added aligned_malloc

comment:3 Changed 3 years ago by git

  • Commit changed from 470b78236ce1c03988d6bf7a333ed84bacb6c2ff to e61490aaca0feb89b9cf35985e7d28d67e67ea5f

Branch pushed to git repo; I updated commit sha1. New commits:

e61490atypo

comment:4 Changed 3 years ago by gh-kliem

  • Status changed from new to needs_review

comment:5 Changed 3 years ago by jdemeyer

This is mostly looking good. Just a few comments:

  1. Can you also add aligned_allocarray and maybe aligned_calloc for completeness?
  1. You could actually consider implementing the non-aligned functions as special case of the aligned functions.
  1. You could add a small inline helper function (in the .pxd file) to implement the "lookup the pointer for realloc" feature. In that helper function, it would also be useful to support the special case of reallocating a NULL pointer, which would then work like a normal allocation.
  1. What's the point of cdef size_t align = alignment
  1. Add a test for the pointer-not-found case of realloc
  1. Use Python 3 compatible print() syntax.
  1. Remove the superfluous blank lines at the end of docstrings and use normal " quotes for exceptions.

comment:6 Changed 3 years ago by jdemeyer

  • Reviewers set to Jeroen Demeyer
  • Status changed from needs_review to needs_work

While you're at it, could you move enlarge_if_needed to the .pxd file? It's more common to put such small inline functions in the .pxd file.

comment:7 Changed 3 years ago by git

  • Commit changed from e61490aaca0feb89b9cf35985e7d28d67e67ea5f to fc46dcbb21cfb55d61799e7124b6425b91b6465e

Branch pushed to git repo; I updated commit sha1. New commits:

fc46dcbworked in comments by Jeroen

comment:8 Changed 3 years ago by gh-kliem

  • Status changed from needs_work to needs_review

comment:9 Changed 3 years ago by git

  • Commit changed from fc46dcbb21cfb55d61799e7124b6425b91b6465e to 7839b41b83501daa3eb97e4d5f1b6e704541f0d9

Branch pushed to git repo; I updated commit sha1. New commits:

7839b41forgot the header...

comment:10 Changed 3 years ago by git

  • Commit changed from 7839b41b83501daa3eb97e4d5f1b6e704541f0d9 to 54046e1434a3bc0debb74913ca959231475e7a14

Branch pushed to git repo; I updated commit sha1. New commits:

54046e1added doctests, corrected alloced sizes

comment:11 follow-up: Changed 3 years ago by jdemeyer

Returns the index of ptr plus 1.

Why the plus 1? Isn't that making it more complicated than it should be?

comment:12 Changed 3 years ago by jdemeyer

Instead of returning an index in find_pointer, it might be more natural to return a pointer, i.e. a void**

comment:13 Changed 3 years ago by jdemeyer

  • Authors changed from Jonathan Kliem to Jonathan Kliem, Jeroen Demeyer

I'll work on this.

comment:14 in reply to: ↑ 11 Changed 3 years ago by gh-kliem

Returns the index of ptr plus 1.

Why the plus 1? Isn't that making it more complicated than it should be?

I wanted to reserve an except value. It turns out that -1 would work just as good and is more natural.

There are never max_size_t -1 objects around.

Replying to jdemeyer:

Instead of returning an index in find_pointer, it might be more natural to return a pointer, i.e. a void**

Yes, indeed. I never thought about returning the actual place where the pointer is stored.

comment:15 Changed 3 years ago by git

  • Commit changed from 54046e1434a3bc0debb74913ca959231475e7a14 to 61ad091c14598d8ef439863c8c22585797707d25

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

596c345added realloc, reallocarray, aligned_malloc
61ad091Clean up and optimize

comment:16 Changed 3 years ago by jdemeyer

After seeing the code, I made some changes to the design, mainly to simplify things. I implemented aligned_malloc() (and similar) in terms of malloc() instead of the other way around.

Needs review.

comment:17 follow-up: Changed 3 years ago by nbruin

Doesn't posix_memalign do this? Is that not available on all supported platforms? If not, shouldn't we defer to posix_memalign when it is available?

comment:18 in reply to: ↑ 17 ; follow-up: Changed 3 years ago by jdemeyer

Replying to nbruin:

Doesn't posix_memalign do this?

What I don't like about posix_memalign is that it doesn't offer variants like calloc or realloc. This branch implements also aligned_calloc and we could (but don't for now) implement also aligned_realloc.

comment:19 in reply to: ↑ 18 ; follow-ups: Changed 3 years ago by nbruin

Replying to jdemeyer:

Replying to nbruin:

Doesn't posix_memalign do this?

What I don't like about posix_memalign is that it doesn't offer variants like calloc

calloc functionality would be trivial to build on top of posix_memalign: a simply API conversion and zero out the memory block.

or realloc. This branch implements also aligned_calloc and we could (but don't for now) implement also aligned_realloc.

Indeed, realloc in some cases might be more efficient because it could happen that the memory block doesn't need to be moved. Otherwise, a "malloc-memcpy-free" would do the same, so posix_memalign could be used as a primitive as well.

It looks to me you're paying a rather hefty penalty with the current design:

  • you are wasting "alignment -1" memory for each allocation
  • you are incurring overhead for storing pointers and looking them up in order to free the memory blocks.

A good implementation of posix_memalign that can work with the internals of the system malloc should be able to avoid both in most cases.

I doubt that the occasional "realloc" that might be satisfied without relocating the memory block will be worth paying the other penalties all the time.

comment:20 in reply to: ↑ 19 Changed 3 years ago by jdemeyer

Replying to nbruin:

calloc functionality would be trivial to build on top of posix_memalign: a simply API conversion and zero out the memory block.

It's a common misconception that calloc is just malloc + memset. This is not true in the case of memory allocated with mmap (which is typically used for large allocations): the memory is already pre-zeroed by the OS so there is no need to zero it a second time. In fact, zeroing it a second time implies actually bringing in physical memory for that region, which is not needed when doing just a mmap. This is especially bad when not all the allocated is eventually used.

  • you are wasting "alignment -1" memory for each allocation

Indeed, this is a penalty to pay. But if the alignment value is relatively small to the size of the allocation (which I expect to be the typical case), then it doesn't matter so much.

  • you are incurring overhead for storing pointers and looking them up in order to free the memory blocks.

We're already doing that in MemoryAllocator before this ticket. So it's not worse than other usages of MemoryAllocator in that sense.

comment:21 in reply to: ↑ 19 Changed 3 years ago by jdemeyer

Replying to nbruin:

Otherwise, a "malloc-memcpy-free" would do the same

Similarly, for large allocations mremap would be used which doesn't need to do any copying, even if the allocation address moves.

comment:22 Changed 3 years ago by gh-kliem

  • Reviewers changed from Jeroen Demeyer to Jeroen Demeyer, Jonathan Kliem

To me, this looks fine and good to go. But I am a newbie.

comment:23 Changed 3 years ago by nbruin

  • Reviewers changed from Jeroen Demeyer, Jonathan Kliem to Jeroen Demeyer, Jonathan Kliem, Nils Bruin
  • Status changed from needs_review to positive_review

Looks OK to me too. It would seem to me the allocator could use a "free" too, but that's perhaps something for another ticket (and perhaps doesn't come up in its applications).

comment:24 Changed 3 years ago by vbraun

  • Branch changed from public/27292 to 61ad091c14598d8ef439863c8c22585797707d25
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.