This adds FPU sharing support with a lazy context switching algorithm.
Every thread is allowed to use FPU/SIMD registers. In fact, the compiler
may insert FPU reg accesses in anycontext to optimize even non-FP code
unless the -mgeneral-regs-only compiler flag is used, but Zephyr
currently doesn't support such a build.
It is therefore possible to do FP access in IRS as well with this patch
although IRQs are then disabled to prevent nested IRQs in such cases.
Because the thread object grows in size, some tests have to be adjusted.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Add the exception depth count to tpidrro_el0 and make it available
through the arch_exception_depth() accessor.
The IN_EL0 flag is now updated unconditionally even if userspace is
not configured. Doing otherwise made the code rather hairy and
I doubt the overhead is measurable.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Due to the use of gperf to generate hash table for kobjects,
the addresses of these kobjects cannot change during the last
few phases of linking (especially between zephyr_prebuilt.elf
and zephyr.elf). Because of this, the gperf generated data
needs to be placed at the end of memory to avoid pushing symbols
around in memory. This prevents moving these generated blocks
to earlier sections, for example, pinned data section needed
for demand paging. So create placeholders for use in
intermediate linking to reserve space for these generated blocks.
Due to uncertainty on the size of these blocks, more space is
being reserved which could result in wasted space. Though, this
retains the use of hash table for faster lookup.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
The ARM64 port is currently using SP_EL0 for everything: kernel threads,
user threads and exceptions. In addition when taking an exception the
exception code is still using the thread SP without relying on any
interrupt stack.
If from one hand this makes the context switch really quick because the
thread context is already on the thread stack so we have only to save
one register (SP) for the whole context, on the other hand the major
limitation introduced by this choice is that if for some reason the
thread SP is corrupted or pointing to some unaccessible location (for
example in case of stack overflow), the exception code is unable to
recover or even deal with it.
The usual way of dealing with this kind of problems is to use a
dedicated interrupt stack on SP_EL1 when servicing the exceptions. The
real drawback of this is that, in case of context switch, all the
context must be copied from the shared interrupt stack into a
thread-specific stack or structure, so it is really slow.
We use here an hybrid approach, sacrificing a bit of stack space for a
quicker context switch. While nothing really changes for kernel threads,
for user threads we now use the privileged stack (already present to
service syscalls) as interrupt stack.
When an exception arrives the code now switches to use SP_EL1 that for
user threads is always pointing inside the privileged portion of the
stack of the current running thread. This achieves two things: (1)
isolate exceptions and syscall code to use a stack that is isolated,
privileged and not accessible to user threads and (2) the thread SP is
not touched at all during exceptions, so it can be invalid or corrupted
without any direct consequence.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Add the ability to define architecture specific structures, notably
the ability to extend struct _cpu with per-CPU arch-specific stuff that
can be accessed with _current_cpu->arch.* similarly to _current->arch.*
for per-thead architecture data.
This is opt-in for architectures that want to benefit from this,
otherwise empty defaults are provided. A placeholder for ARM64 is
included to show the pattern.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
We can find caller of z_arm64_mmu_init is on primary
core or not, so no need to check mpidr, just add a
function parameter.
Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Let's fully exploit tpidrro_el0 by storing in it the current CPU's
struct _cpu instance alongside the userspace mode flag bit. This
greatly simplifies the code needed to get at the cpu structure, and
this paves the way to much simpler multi cluster support, as there
is no longer the need to decode MPIDR all the time.
The same code is used in the !SMP case as there are benefits there too
such as avoiding the literal pool, and it looks cleaner.
The tpidrro_el0 value is no longer stored in the exception stack frame.
Instead, we simply restore the user mode flag based on the SPSR value.
This way, more flag bits could be used independently in the future.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
When MPU is enabled, the sections need to be 64 bytes aligned.
In the case of MMU, BSS section will be 4k aligned, because the first
variable in BSS section 'base_xlat_table' is explicitly aligned by
'__aligned(NUM_BASE_LEVEL_ENTRIES * sizeof(uint64_t))'.
However, with MPU, we do not have such a variable. So it's necessary
to fix the alignment of the BSS section in the linker.ld
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
According to Armv8-R64 Spec, MPU related meta data(region base/limit)
is 64 bits. So we need to re-define MPU related data structure here.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
If default config ARM_MMU is set to n, samples/tests will have
compilation error. This is because the arch/arm/aarch64/arm_mmu.h
is always included.
Signed-off-by: Henry Wang <Henry.Wang@arm.com>
The structure for the arm64_cpu_init array has to carry the cache
alignment on the whole structure and not on some internal padding
to achieve the desired effect.
And align struct __esf to a 16-byte boundary which will also align
its size accordingly. This structure is allocated on the stack on
exception entry and the ABI prescribed 16-byte stack alignment
should be preserved.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Split ARM and ARM64 architectures.
Details:
- CONFIG_ARM64 is decoupled from CONFIG_ARM (not a subset anymore)
- Arch and include AArch64 files are in a dedicated directory
(arch/arm64 and include/arch/arm64)
- AArch64 boards and SoC are moved to soc/arm64 and boards/arm64
- AArch64-specific DTS files are moved to dts/arm64
- The A72 support for the bcm_vk/viper board is moved in the
boards/bcm_vk/viper directory
Signed-off-by: Carlo Caione <ccaione@baylibre.com>