| Commit message (Collapse) | Author | Age | Files | Lines |
|
Remove forward declaration for service routine.
Reorder code and keep hidden_def right after the respective function.
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
This optimization is based on prefetching and 64-bit data transfers via the FPU
(little-endian mode only).
Tests show that:
------------------------------------------------
Memory bandwidth      |        Gain
                      | sh4-300   | sh4-200
------------------------------------------------
512 bytes to 16KiB    | ~20%      | ~25%
from 32KiB to 16MiB   | ~190%     | ~5%
------------------------------------------------
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
This patch adds the strcpy and strncpy assembly routines.
Benchmarks showed the following gains:
~7% for strcpy
~30% for strncpy
Note: the uClibc string tests pass without any failures.
These functions have only been tested on SH4; for this reason
I have deliberately added them to the sh4 sub-folder.
If somebody would like to test them on other SH CPUs, they can be moved
to the common sh folder.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
This patch adds the memmove function for SH4.
Previously, the generic implementation was used by default.
The new code uses memcpy for BWD copies and implements the FWD copy itself
when required (see the comment within the code).
The idea is to take advantage of the optimised memcpy for SH4
and to use the FPU for FWD copies (for big sizes) as well.
The LMBench bw_mem test showed a significant improvement on uClibc because bcopy
invokes memmove directly.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
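The direction choice described in this message can be sketched in portable C. This is only an illustration under stated assumptions, not the SH4 assembly from the patch: bwd_memcpy, fwd_copy and sketch_memmove are hypothetical names, and bwd_memcpy is a byte-wise stand-in for a memcpy that copies from the last byte down to the first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an arch memcpy that copies backward
 * (last byte first). The real SH4 routine is assembly. */
static void *bwd_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        d[n] = s[n];
    return dest;
}

/* Plain forward copy (first byte first), used only when a backward
 * copy would overwrite source bytes that are still to be read. */
static void *fwd_copy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t i;

    for (i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}

void *sketch_memmove(void *dest, const void *src, size_t n)
{
    /* Unsigned compare: true only when dest <= src < dest + n,
     * i.e. dest lies below src and the two regions overlap. */
    if ((uintptr_t)src - (uintptr_t)dest < n)
        return fwd_copy(dest, src, n);

    /* Otherwise a backward copy is safe, so reuse the backward memcpy. */
    return bwd_memcpy(dest, src, n);
}
```

With this split, the common non-overlapping case and the "dest above src" case both go through the memcpy path, and only the "dest below src with overlap" case needs the separate forward loop.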
|
This patch adds the SH memset assembly implementation currently included
in the kernel.
It also adds, for little-endian mode only, 64-bit data transfers via the FPU
(using paired single-precision mode).
Tests show that on SH4-300 we gain ~100% for sizes greater than 1KiB.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
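The idea of filling memory with 64-bit transfers can be illustrated in portable C. This is a rough sketch only: it uses ordinary 8-byte integer stores rather than the FPU, and memset64_sketch is a hypothetical name, not a routine from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void *memset64_sketch(void *dest, int c, size_t n)
{
    unsigned char *d = dest;
    /* Replicate the fill byte into all eight bytes of a 64-bit word. */
    uint64_t pattern = (uint64_t)(unsigned char)c * UINT64_C(0x0101010101010101);

    /* Head: byte stores until the pointer is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 7) != 0) {
        *d++ = (unsigned char)c;
        n--;
    }

    /* Body: one 8-byte transfer per iteration. A fixed-size memcpy
     * avoids aliasing problems and is typically compiled to a single store. */
    while (n >= 8) {
        memcpy(d, &pattern, 8);
        d += 8;
        n -= 8;
    }

    /* Tail: remaining bytes. */
    while (n > 0) {
        *d++ = (unsigned char)c;
        n--;
    }
    return dest;
}
```

The wide stores only pay off once the buffer is large enough to amortise the head and tail handling, which is in line with the gain being reported for sizes above 1KiB.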
|
Signed-off-by: Hideo Saito <saito@densan.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
See Linux Kernel commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e08b954c9a140f2062649faec72514eb505f18c3
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
|
Step 4
libc/string and asm implementation
|
|
(backward memcpy algorithm)
Modified libc/string/generic/Makefile.in to handle
subtarget implementations.
Fixed the generic memmove code to handle a backward memcpy
by means of a selectable config option, __ARCH_HAS_BWD_MEMCPY__.
This option is enabled for the SH4 architecture.
Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
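How a config option such as __ARCH_HAS_BWD_MEMCPY__ might gate the generic code can be sketched as follows. This is not the actual uClibc source: generic_memmove_sketch is a hypothetical name, the loops are byte-wise for brevity, and the macro is defined locally here only so that the sketch compiles the backward-memcpy path (in uClibc it would come from the configuration).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: normally provided by the uClibc config. */
#ifndef __ARCH_HAS_BWD_MEMCPY__
#define __ARCH_HAS_BWD_MEMCPY__ 1
#endif

void *generic_memmove_sketch(void *dest, const void *src, size_t len)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Unsigned compare: true when dest is below src, or when the
     * regions do not overlap at all. A forward copy is safe then. */
    if ((uintptr_t)dest - (uintptr_t)src >= len) {
#ifdef __ARCH_HAS_BWD_MEMCPY__
        /* The arch memcpy runs backward, so it must not be used for
         * a copy that has to run forward; copy by hand instead. */
        while (len--)
            *d++ = *s++;
#else
        /* Relies on memcpy copying forward. */
        memcpy(dest, src, len);
#endif
    } else {
        /* src <= dest < src + len: copy backward, last byte first. */
        d += len;
        s += len;
        while (len--)
            *--d = *--s;
    }
    return dest;
}
```

The point of the option is that the forward branch can no longer assume anything about the direction of memcpy once an architecture supplies a backward-copying one.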
|