This adds support for 32 double-precision registers in the context
switching of aarch32 architecture.
Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
`fpscr` is assigned from `struct __fpu_sf.fpscr` in `vfp_restore`, but it
wasn't saved into `struct __fpu_sf.fpscr` in the svc and isr handler, So
it may be a dirty value.
- Fix it by saving `fpscr` in the svc hand isr handler.
- Jump out if FPU isn't enabled
Signed-off-by: Huifeng Zhang <Huifeng.Zhang@arm.com>
The code_relocation feature creates generic section names that sometimes
conflict with already existing names.
This patch adds a '_reloc_' word to the created names to reduce the risk
of conflict.
This solves #54785.
Signed-off-by: Björn Stenberg <bjorn@haxx.se>
This adds a few line use zephyr_syscall_header() to include
headers containing syscall function prototypes.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Some RISCV platforms shipping a CLIC have a peculiar interrupt ID
ordering / mapping.
According to the "Core-Local Interrupt Controller (CLIC) RISC-V
Privileged Architecture Extensions" Version 0.9-draft at paragraph 16.1
one of these ordering recommendations is "CLIC-mode interrupt-map for
systems retaining interrupt ID compatible with CLINT mode" that is
described how:
The CLINT-mode interrupts retain their interrupt ID in CLIC mode.
[...]
The existing CLINT software interrupt bits are primarily intended for
inter-hart interrupt signaling, and so are retained for that purpose.
[...]
CLIC interrupt inputs are allocated IDs beginning at interrupt ID
17. Any fast local interrupts that would have been connected at
interrupt ID 16 and above should now be mapped into corresponding
inputs of the CLIC.
That is a very convoluted way to say that interrupts 0 to 15 are
reserved for internal use and CLIC only controls interrupts reserved for
platform use (16 up to n + 16, where n is the maximum number of
interrupts supported).
Let's now take now into consideration this situation in the DT:
clic: interrupt-controller {
...
};
device0: some-device {
interrupt-parent = <&clic>;
interrupts = <0x1>;
...
};
and in the driver for device0:
IRQ_CONNECT(DT_IRQN(node), ...);
From the hardware prospective:
(1a) device0 is using the first IRQ line of the CLIC
(2a) the interrupt ID / exception code of the `MSTATUS` register
associated to this IRQ is 17, because the IDs 0 to 15 are reserved
From the software / Zephyr prospective:
(1b) Zephyr is installing the IRQ vector into the SW ISR table (and into
the IRQ vector table for DIRECT ISRs in case of CLIC vectored mode)
at index 0x1.
(2b) Zephyr is using the interrupt ID of the `MSTATUS` register to index
into the SW ISR table (or IRQ vector table)
It's now clear how (2a) and (2b) are in contrast with each other.
To fix this problem we have to take into account the offset introduced
by the reserved interrupts. To do so we introduce
CONFIG_RISCV_RESERVED_IRQ_ISR_TABLES_OFFSET as hidden option for the
platforms to set.
This Kconfig option is used to shift the interrupt numbers when
installing the IRQ vector into the SW ISR table and/or IRQ vector table.
So for example in the previous case and using
CONFIG_RISCV_RESERVED_IRQ_ISR_TABLES_OFFSET == 16, the IRQ vector
associated to the device0 would be correctly installed at index 17 (16 +
1), matching what is reported by the `MSTATUS` register.
CONFIG_NUM_IRQS must be increased accordingly.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Zephyr currently only supports CLINT direct mode and CLINT vectored
mode. Add support for CLIC vectored mode as well.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Before adding support for the CLIC vectored mode, rename
CONFIG_RISCV_MTVEC_VECTORED_MODE to CONFIG_RISCV_VECTORED_MODE to be
more generic and eventually include also the CLIC vectored mode.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
With all the stacks and TSS (etc), the x86_64 arch code can only
support maximum of 4 CPUs at the moment. So add a build assert
if more CPUs are specified via CONFIG_MP_MAX_NUM_CPUS, also
overwrite the range value for CONFIG_MP_MAX_NUM_CPUS.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
We are missing setting of switch_handle for the thread which
is aborting due to exception (i.e. in case of k_panic or
__ASSERT triggered). This may cause livelock in SMP code
after a08e23f68e commit ("kernel/sched: Fix SMP
must-wait-for-switch conditions in abort/join").
Fix that.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Evgeniy Paltsev <PaltsevEvgeniy@gmail.com>
Currently the lazy fpu saving algorithm in arm64 is using the fpu_owner
pointer from the cpu structure to understand the owner of the context
in the cpu and save it in case someone different from the owner is
accessing the fpu.
The semantics for memory consistency across smp systems is quite prone
to errors and reworks on the current code might miss some barriers that
could lead to inconsistent state across cores, so to overcome the issue,
use atomics to hide the complexity and be sure that the code will behave
as intended.
While there, add some isb barriers after writes to cpacr_el1, following
the guidance of ARM ARM specs about writes on system registers.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
This will avoid unconditionally pulling z_riscv_switch() into the build
as it is not used, reducing the resulting binary some more.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
With Zephyr now always using `int main(void)`, there's no longer any need
for this definition. The last remaining use which gated the declaration of
_posix_zephyr_main isn't necessary as adding that declaration
unconditionally is harmless.
Signed-off-by: Keith Packard <keithp@keithp.com>
cpu_node_list does not hold the corrent mapping of cpu id and mpid when
core booting sequence does not follow the DTS cpu node sequence. This
will cause an issue that sgi cannot deliver to the right target.
Add the cpu_map array to hold the corrent mapping between cpu id and
mpid.
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
Each core should init their own stack during the reset when SMP enabled,
but do not touch others. The current init results in each core starting
init the stack from the same address which will break others.
Fix the issue by setting a correct start address.
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
LOG system has unalignment access instruction which will cause an
alignment exception before MPU is enabled. Remove the LOG print before
MPU is enabled to avoid this issue.
Signed-off-by: Jaxson Han <jaxson.han@arm.com>
This trick turns out also to be needed by the abort/join code.
Promote it to a more formal-looking internal API and clean up the
documentation to (hopefully) clarify the exact behavior and better
explain the need.
This is one of the more... enchanted bits of the scheduler, and while
the trick is IMHO pretty clean, it remains a big SMP footgun.
Signed-off-by: Andy Ross <andyross@google.com>
For secure EL2 to be entered the EEL2 bit in SCR_EL3 must be set. This
should only be set if Zephyr has not been configured for NS mode only,
if the device is currently in secure EL3, and if secure EL2 is supported
via the SEL2 bit in AA64PFRO_EL1. Added logic to enable EEL2 if all
conditions are met.
Signed-off-by: Chad Karaginides <quic_chadk@quicinc.com>
This reverts commit f0b458a619.
This is a pointless change that simply increases footprint.
Existing code already supports compilation without multithreading.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Per the ARMv8 architecture document, modification of the system control
register is a context-changing operation. Context-changing operations are
only guaranteed to be seen after a context synchronization event.
An ISB is a context synchronization event. One has been placed after
each SCTLR modification. Issue was found running full speed on target.
Signed-off-by: Chad Karaginides <quic_chadk@quicinc.com>
Allow builds which has CONFIG_MULTITHREADING disabled.
This is reduce code footprint which is handy for
constrained targets as bootloaders.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
Allow builds which has CONFIG_MULTITHREADING disabled.
This is reduce code footprint which is handy for
constrained targets as bootloaders.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
This make MCUboot build as Zephyr application.
Providing optinal 2nd stage bootloader to the
IDF bootloader, which is used by default.
This provides more flexibility when building
and loading multiple images and aims to
brings better DX to users by using the sysbuild.
MCUboot and applications has now separate
linker scripts.
Signed-off-by: Marek Matej <marek.matej@espressif.com>
Let's consider CPU1 waiting on a spinlock already taken by CPU2.
It is possible for CPU2 to invoke the FPU and trigger an FPU exception
when the FPU context for CPU2 is not live on that CPU. If the FPU context
for the thread on CPU2 is still held in CPU1's FPU then an IPI is sent
to CPU1 asking to flush its FPU to memory.
But if CPU1 is spinning on a lock already taken by CPU2, it won't see
the pending IPI as IRQs are disabled. CPU2 won't get its FPU state
restored and won't complete the required work to release the lock.
Let's prevent this deadlock scenario by looking for pending FPU IPI from
the spinlock loop using the arch_spin_relax() hook.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Let's consider CPU1 waiting on a spinlock already taken by CPU2.
It is possible for CPU2 to invoke the FPU and trigger an FPU exception
when the FPU context for CPU2 is not live on that CPU. If the FPU context
for the thread on CPU2 is still held in CPU1's FPU then an IPI is sent
to CPU1 asking to flush its FPU to memory.
But if CPU1 is spinning on a lock already taken by CPU2, it won't see
the pending IPI as IRQs are disabled. CPU2 won't get its FPU state
restored and won't complete the required work to release the lock.
Let's prevent this deadlock scenario by looking for a pending FPU IPI
from the arch_spin_relax() hook and honor it.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
The compiler is not able to emit a proper DSB operation for ARM64. Move
to the arch-specific implementation and use assembly code instead.
Signed-off-by: Carlo Caione <ccaione@baylibre.com>
Enhance for cases when call z_float_enable() with NULL thread.
Signed-off-by: Dong Wang <dong.d.wang@intel.com>
Signed-off-by: Qipeng Zha <qipeng.zha@intel.com>
This adds code to always map data TLB for VECBASE so that
we would be dealing with fewer data TLB misses during
exception handling. With VECBASE always mapped, there is
no need to pre-load anymore.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
This moves the TLB miss handling to the C exception handler.
This also allows us to handle page faults (for example,
unmapped pages) during this time as any more exceptions
handled in the C handler will not trigger the double
exception handler but the same C handler.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Instead of being able to arbitrarily set the PTEVADDR for page
table, this provides choices (currently just one). This is in
preparation to enable handling memory management exception in
C code. For that to work, we will need to pre-load the page
table address (PTEVADDR) for the memory page containing
exception code and data (containing jump addresses), and
various stacks. This is to prempt any TLB misses during handling
the level 1 interrupt code. If a TLB miss is encountered during
handling of level 1 interrupt, we will be thrown into double
exception handling code where we will get stuck in infinite
loop. However, in order to pre-load the page table entries,
PTEVADDR needs to be calculated. This requires the use of
PTEVADDR base which cannot be loaded via l32r, as we may cause
a data TLB miss. So we must be able to grab the PTEVADDR base
address strictly within code, and must be without any data
load. So changing CONFIG_XTENSA_MMU_PTEVADDR to be based on
choice so we can have pre-defined bit shift value for shift
operation. This shift value will be used in exception handling
code.
Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Add a build option to tell if memory should be mapped in cached
and uncachedr regions.
If the memory is neither in cached nor uncached region it is not double
mapped.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Initial support for Xtensa MMU version 3. It is using a two level page
table based on fact that the page table is in the virtual space. Only
the top level (page directory) is wired mapped in the TLB to avoid
second level page miss.
The mapped memory is completely fragmented in multiple sections, maybe
we find a better way in future.
The exception handler is where we effectively map the memory, the way it
works is:
1) SW try to access some memory address
2) The address is not mapped, so the MMU will try the auto-refill,
looking the page table
3) The page table contents is not mapped (remember, just the top-level page
is mapped)
4) An exception will be triggered, in the exception we try to read the
portion of the page table that maps the original address
5) The address is not mapped, so the MMU will try again the auto-refill.
This time though, the address is mapped by the top level page that is
properly mapped. (The top-level page maps the page table itself).
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>
Unlike tracing module mainly for debug usage, this is
to allow runtime profiling IRQ performance data, and
target to enable it in product release since platform
can choose to make it work with low weight protocol.
Enable this option and implement runtime_irq_stats()
in platform code, such as Intel ISH platform implement
with SHMI protocol to allow host profiling irq stats.
Signed-off-by: Qipeng Zha <qipeng.zha@intel.com>
Until now iterable sections APIs have been part of the toolchain
(common) headers. They are not strictly related to a toolchain, they
just rely on linker providing support for sections. Most files relied on
indirect includes to access the API, now, it is included as needed.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
The commit 434ca63e2f introduced the
Cortex-A and Cortex-R CPU type dependency to `CONFIG_FP16` based on
the reasoning that the hardware half-precision support is only
available on them.
While it is true that the _hardware_ half-precision support is limited
to these targets, the compiler will provide the _software_ emulation
for the targets that lack the hardware half-precision support, as
mentioned in 41fd6e003c (the original
commit that introduced `CONFIG_FP16`).
Signed-off-by: Stephanos Ioannidis <stephanos.ioannidis@nordicsemi.no>
In z_xtensa_backtrace_print the parameter depth is checked for <= 0.
There is no need to check it again later, also, since the variable is
not used after the while loop we can use directly the parameter without
an additional variable.
Signed-off-by: Flavio Ceolin <flavio.ceolin@intel.com>