Assembly x64 Intro - SSE2 4x4D Transpose
Source: Internet  Editor: 程序博客网  Date: 2024/05/16 18:04
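; in: xmm1, xmm2, xmm3, xmm4 (rows a, b, c, d), xmm5 (scratch); pOut: xmm1, xmm4, xmm5, xmm3
; element order in the comments below is lowest doubleword first
%macro SSE2_Trans4x4D 5
SSE2_XSawp dq, %1, %2, %5 ; %1 = a0 b0 a1 b1, %5 = a2 b2 a3 b3
SSE2_XSawp dq, %3, %4, %2 ; %3 = c0 d0 c1 d1, %2 = c2 d2 c3 d3
SSE2_XSawp qdq, %1, %3, %4 ; %1 = a0 b0 c0 d0, %4 = a1 b1 c1 d1
SSE2_XSawp qdq, %5, %2, %3 ; %5 = a2 b2 c2 d2, %3 = a3 b3 c3 d3
%endmacro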
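; for TRANSPOSE: %1 = unpack suffix (dq, qdq), %2 = a (in/out), %3 = b (in), %4 = scratch (out)
%macro SSE2_XSawp 4
movdqa %4, %2 ; keep a copy of a
punpckl%1 %2, %3 ; %2 = low halves of a and b, interleaved
punpckh%1 %4, %3 ; %4 = high halves of a and b, interleaved
%endmacro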
SSE2_Trans4x4D xmm4, xmm2, xmm1, xmm3, xmm5 ; pOut: xmm4,xmm3,xmm5,xmm1
This is similar to MMX_Trans4x4W. MMX_Trans4x4W operates on 16-bit words, whereas SSE2_Trans4x4D operates on 32-bit doublewords.
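For readers who want to check the data movement outside an assembler, the same two-stage unpack can be sketched in C with SSE2 intrinsics. This is only an illustrative mirror of the macro, not code from the original source: the function names transpose4x4_epi32 and transpose4x4_array are hypothetical, and the sketch assumes a row-major 4x4 array of int32_t as input.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Transpose a 4x4 matrix of 32-bit values held in four XMM registers.
 * Hypothetical helper mirroring the two unpack stages of SSE2_Trans4x4D. */
static void transpose4x4_epi32(__m128i *r0, __m128i *r1, __m128i *r2, __m128i *r3)
{
    /* Stage 1: interleave doublewords (punpckldq / punpckhdq). */
    __m128i ab_lo = _mm_unpacklo_epi32(*r0, *r1); /* a0 b0 a1 b1 */
    __m128i ab_hi = _mm_unpackhi_epi32(*r0, *r1); /* a2 b2 a3 b3 */
    __m128i cd_lo = _mm_unpacklo_epi32(*r2, *r3); /* c0 d0 c1 d1 */
    __m128i cd_hi = _mm_unpackhi_epi32(*r2, *r3); /* c2 d2 c3 d3 */

    /* Stage 2: interleave quadwords (punpcklqdq / punpckhqdq). */
    *r0 = _mm_unpacklo_epi64(ab_lo, cd_lo); /* a0 b0 c0 d0 */
    *r1 = _mm_unpackhi_epi64(ab_lo, cd_lo); /* a1 b1 c1 d1 */
    *r2 = _mm_unpacklo_epi64(ab_hi, cd_hi); /* a2 b2 c2 d2 */
    *r3 = _mm_unpackhi_epi64(ab_hi, cd_hi); /* a3 b3 c3 d3 */
}

/* Hypothetical wrapper: transpose a row-major 4x4 int32 array from src into dst.
 * Unaligned loads and stores are used so the caller need not align the arrays. */
static void transpose4x4_array(const int32_t src[16], int32_t dst[16])
{
    assert(src && dst);

    __m128i r0 = _mm_loadu_si128((const __m128i *)(src + 0));
    __m128i r1 = _mm_loadu_si128((const __m128i *)(src + 4));
    __m128i r2 = _mm_loadu_si128((const __m128i *)(src + 8));
    __m128i r3 = _mm_loadu_si128((const __m128i *)(src + 12));

    transpose4x4_epi32(&r0, &r1, &r2, &r3);

    _mm_storeu_si128((__m128i *)(dst + 0), r0);
    _mm_storeu_si128((__m128i *)(dst + 4), r1);
    _mm_storeu_si128((__m128i *)(dst + 8), r2);
    _mm_storeu_si128((__m128i *)(dst + 12), r3);
}
```

Unlike the macro, which leaves the transposed rows in a permuted set of registers (hence the pOut comments), the C sketch writes the rows back in order and lets the compiler handle register allocation.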