{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":16143904,"defaultBranch":"master","name":"blis","ownerLogin":"flame","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2014-01-22T15:58:24.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/6494486?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1723405956.0","currentOid":""},"activityList":{"items":[{"before":null,"after":"14ae6b258bf1526130e18627cd2b362ce9108229","ref":"refs/heads/stable-apr24-cand0","pushedAt":"2024-08-11T19:52:36.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Refactor the control tree and other infrastructure (#710)\n\nDetails:\n1. A \"plugin\" architecture.\n- Users are now able to register new kernels, kernel preferences, and\n blocksizes at runtime, directly from user applications.\n- Plugins can be created, configured, and built using only an installed\n version of BLIS -- no source or source code changes required.\n- Plugins support both reference and optimized kernels, as well as\n custom configuration-to-kernel-set mappings.\n- Building plugins (including reference and relevant optimized kernels)\n for enabled architectures or architecture families is automated, as is\n linking into the final library.\n- The configure script is now installed as 'configure-plugin'. In this\n mode, it can be used to initialize a plugin from a template including\n optional example code, and prepare a build system for compiling the\n plugin into a shared or static library.\n- Additional configuration files, templates, and build system components\n are also installed to '%prefix%/share/blis'.\n- The cntx_t struct now has extensible data structures for holding\n kernels, preferences, and blocksizes. 
These are based on a \"stack\"\n structure which contains a list of fixed-size data blocks. Adding a\n new entry (which may require allocating a new block or reallocating\n the block pointer array) requires locking, but looking up entries is\n lock-free and takes O(1) time.\n- Kernels can depend on either 1 or 2 type parameters (e.g.\n mixed-precision packing requires 2). The func2_t struct supports\n the latter, but can be implicitly cast to func_t if only \"diagonal\"\n entries are needed. The number of type parameters can be inferred from\n the kernel ID for type safety.\n- Functions have been added to register new kernels, preferences, and\n blocksizes with the global kernel structure (gks). This creates\n corresponding entries in each allocated context and returns the next\n available ID. Plugins use this API to register user kernels, although\n the user is responsible for tracking the returned IDs for later\n lookup. Setting newly-registered reference kernels, as well as\n overriding these with optimized kernels is done in exactly the same\n manner as in bli_cntx_init_ref() and bli_cntx_init_().\n\n2. Restructuring of the control and thread control trees.\n- The control tree has been substantially restructured to support more\n flexibility.\n- The \"default\" control trees for gemm (also used for\n hemm/symm/herk/her2k/syrk/syr2k/trmm/trmm3) and trsm are now\n represented as a single structure containing all necessary control\n tree nodes and parameters.\n- An API has been added to modify the default gemm/trsm control trees.\n- This same API is used by the framework and packm/gemm/trsm variants\n to access specific control tree nodes.\n- Users can alternatively create a custom control tree from scratch.\n- The blocksizes are now encoded directly in the control tree, rather\n than via loop IDs. 
The logic for adjusting blocksizes for certain\n operations has been moved to the control tree initialization.\n- Type information is encoded in the control tree to drive proper\n selection of packing and computational kernels provided by the user.\n- The packing microkernel now receives an opaque \"params\" struct which\n is user-definable and can be used to pass additional information\n through the call stack.\n- The auxinfo_t struct has been updated with a .params field for\n opaque user data as well as the global offsets of the current\n microtile.\n- The packm and gemm variants can be overridden by the user, and also\n receive an opaque params struct via the associated control tree\n node.\n- The structure-aware packing kernel bli_packm_struc_cxk() is no longer\n hard-coded to be called from the default packm variant, but can be\n overridden by the user. It also supports mixed-precision/mixed-domain\n natively now.\n- The thread control tree (thrinfo_t) is now created entirely up-front\n by inspecting the control tree. The required number of threads at each\n level is encoded in the control tree via loop IDs (actually a bitfield\n of loop IDs), although the ordering and number of such IDs is\n arbitrary. The logic for adjusting the number of threads at each level\n based on operation type (e.g. trmm) is now in the control tree\n initialization and expressed by combining loop IDs from multiple\n levels into a single level.\n- The mem_t object containing the pack buffer pointer has been moved\n from the control tree to the thread control tree. NOTE: **The control\n tree is now strictly const throughout the operation, and only a\n single copy is shared by all threads.**\n- The thread control tree node for packing has been changed so that\n there is no longer a \"fake\" node indicating a team of single threads.\n Instead, the number of threads and thread IDs in the \"normal\" thread\n control tree node are used. 
This change has also been made to the\n gemmsup thread control tree and packing variants, as well as to the\n gemmlike sandbox.\n- Parameters controlling packing (e.g. inversion of the diagonal,\n direction, schema) are not stored directly in the control tree but in\n the opaque params struct. The packing control tree node and its\n default params struct are stored together in the \"combined\"\n gemm/trsm control tree structure and initialized as a unit. Users can\n update these parameters individually or substitute a custom packm\n variant and params struct.\n- The \"target\" and \"execution\" datatypes has been removed from the obj_t\n struct and replaced by type information in the control tree.\n- The \"sub-node\" and \"sub-prenode\" of a control tree node have been\n replaced by an arbitrary number of sub-nodes accessed by index. There\n is a hard cap on the number of sub-nodes (currently 2). Sub-nodes are\n added during control tree initialization, *after*\n creation/initialization of the parent node through an updated API.\n- The level-3 thread decorator has been significantly simplified and\n directly calls bli_l3_int(). The control tree is created externally,\n and it is no longer necessary to alias matrices or set object pack\n schemas. Also, the rntm_t passed in may be NULL. Finally, family\n and scalar information is no longer needed here.\n- bli_l3_int() is now a simple inline function which extracts the next\n control tree node and variant and calls it.\n- bli_*_front() have been removed and inlined into the expert object\n API with significant simplification.\n- 1m (or other induced method) no longer uses an alternative cntx_t.\n- The .pack_fn/.ker_fn pointers and associated params fields on the\n obj_t were removed in favor of the present solution.\n\n3. 
Overhaul of variable substitution in configure script.\n- The configure script has been somewhat re-written to use a\n centralized mechanism for substituting variables into build system and\n other configuration files.\n- All substitution variables go through the same pathway now, which\n necessitated some variable naming changes for variables which were\n named the same in e.g. Makefile and bli_config.h but with\n different definitions.\n- CC and CXX variables can now contain spaces, e.g. 'g++ -std=c++17'.\n This provides better support for integration with build tooling such\n as autotools.\n\n4. Overhaul of packing kernels.\n- Previously there were two packing kernels referenced in the cntx_t\n structure for MRxk and NRxk shaped micropanels, respectively. These\n have now been merged into one kernel which is responsible for packing\n any dense rectangular portion of either A or B.\n- The packing kernel now receives information about the register\n blocksize (cdim_max) and duplication factor (the \"broadcast-B\"\n format, although this can also apply to the A matrix).\n- The structure-aware packing kernel (bli_packm_struc_cxk(), which is\n now user-overridable) also receives global offsets of the current\n micropanel within A or B.\n- Explicit kernels for packing the diagonal blocks of\n triangular/symmetric/Hermitian matrices have been added to the\n cntx_t. This means that the bli_packm_struc_ckx() \"kernel\" no longer\n needs to directly touch data (except to zero out some regions).\n- bli_packm_struc_cxk() has also been updated to work only in terms of\n fundamental elements (i.e., real datatypes) when computing offsets and\n when zeroing data, which greatly simplifies mixed-domain/1m packing.\n- bli_packm_scalar() has been updated to better support complex scalars\n in mixed-domain operations.\n- Pack schemas for PACKED_ROW_PANELS* and PACKED_COL_PANELS* have\n been merged into simply PACKED_PANELS*. 
This reflects the merging of\n the packing kernels into a single generic kernel. There were only a\n very few places which needed the row/column information and this is\n now supplied by alternative means.\n- Packing variants always behave \"as if\" the A matrix were being packed\n (i.e. the code assumes packing column-stored row panels). Packing of B\n is handled by applying an implicit or explicit transpose before\n packing. This change also applies to gemmsup.\n\n5. Improved MD/MP support.\n- All level-3 operations (except trsm) now support full\n mixed-domain/mixed-precision operation.\n- Explicit 1m packing kernels have been added in the cntx_t.\n- An explicit 1m microkernel wrapper has been added to the cntx_t.\n- An extra packing kernel for the \"ro\" format has been added, along with\n the pack_t enumeration value. This supports the packing for\n real*complex -> real, including potential scaling by a complex alpha,\n support for structured matrices, etc.\n- Extra microkernel wrappers for mixed-domain operations have been added\n to support the 'ccr' (and by extension, 'crc'), 'rcc', and 'crr'\n cases. Notably this includes full support for general stride storage\n and complex alpha/beta.\n- Packing kernels and gemm microkernels are now \"templated\" based on two\n type parameters rather than one. For packing this allows direct\n optimization of mixed-precision kernels, and for gemm microkernels\n this allows direct optimization of mixed-precision without writing to\n a temporary buffer. Reference packing kernels are directly\n instantiated for all mixes of precisions, while by default\n mixed-precision gemm microkernels are supported via a microkernel\n wrapper. 
The \"old\" way of specifying optimized kernels using a single\n type parameter works unchanged.\n- alpha and beta are typecast appropriately to the computational or\n output datatype, respectively, and **always** to the complex domain.\n Scalar typecasting has also been added to gemmsup for safety.\n- The gemm macrokernel doesn't have to do any typecasting anymore, as a\n microkernel wrapper or optimized mixed-precision/mixed-domain kernel\n now handles this.\n- 1m and mixed-domain operations now always use a microkernel wrapper,\n rather than adjusting parameters in the gemm macrokernel.\n- The gemmt macrokernel **does** still have to handle explicit\n write-back of microtiles which intersect the diagonal, although\n typecasting has already been performed.\n- The gemmt_x_ker_var2(), trmm_xx_ker_var2(), and trsm_xx_ker_var2()\n functions have been removed. The appropriate macrokernel pointer is\n selected during control tree initialization.\n- Real domain MR/NR are checked for even-ness based on the gemm\n microkernel's row preference in order to guarantee proper 1m and\n mixed-domain operation.\n- Full range of mixed-domain/mixed-precision functionality tested in the\n testsuite ('input.*.mixed').\n\n6. Other changes:\n- The build system has been updated to support C++ source files\n throughout the framework. While the intent is not to add such files to\n BLIS itself, this supports plugins written in C++.\n- Many instances of configuration-specific code have been simplified by\n introducing an INSERT_GENTCONF macro which instantiates a block of\n code for each enabled sub-configuration. The ConfigurationHowTo.md\n document has been updated accordingly.\n- PASTEMAC?/PASTECH?/PASTEF77? 
have been removed in favor of\n variadic macros which accept any number of arguments (up to a\n reasonable limit).\n- The INSERT_GENTFUNC* macros have been updated to clean up\n mixed-precision and mixed-domain instantiations.\n- bli_align_dim_to_mult() has been updated to support rounding either up\n or down based on a flag.\n- Checking for empty matrices and other early exits (level-3 only) has\n been consolidated into a single utility function.\n- The auxinfo_t struct is always passed as const.\n- The new function bli_obj_alias_submatrix() aliases a matrix while also\n resetting the root to NULL, offsets to zero (while adjusting the\n buffer), and applying any implicit transpose.\n- Level-3 pruning functions now only check matrix structure to see what\n to do, not the operation family.\n- gemmsup packing has been updated to use the \"normal\" pack buffer\n allocation routines.\n- Remove duplicate checks for early return from gemmsup handler.\n- bli_determine_blocksize() has been significantly simplified.\n- Partitioning packed panels is no longer allowed.\n- Added bli_xxsame macros.\n- Automated the calculation of info bit shifts and masks based on\n predefined bit sizes for various flags. 
This greatly simplifies\n reordering, adding, or removing flags from the info/info2 bitfields.\n- Moved more BLIS_NUM_* macros into the corresponding enums as the\n last entry so that the value is automatically computed.\n- Better const-correctness in some level0 scalar macros.\n- Better mixed-precision support in some level0 scalar macros.\n- Added a bli_axpbys_mxn() macro.\n- bli_thread_range_sub() takes explicit thread ID and number of threads\n rather than a thrinfo_t node.\n- \"De-templated\" BLIS gemmlike sandbox (specifically, bls_gemm_bp_var1()\n and bls_packm_var1()).\n- Combined bls_l3_packm_[ab]() into one function with thin wrappers.\n- Deleted bls_packm_var[23]().\n- Add a \"termination tag\" to the testsuite output so that\n 'make check-blis' can accurately check for successful completion.\n- Add a new function to centrally compute FLOPs for level-3 operations\n in the testsuite.\n\n- (cherry picked from a49238e6141c96a41aa3c2a4adb0b0663d0b4968)","shortMessageHtmlLink":"Refactor the control tree and other infrastructure (#710)"}},{"before":"83f096ca9a86fa8b0727332113549a4f4595ffff","after":null,"ref":"refs/heads/dont_defer_flat_cblas","pushedAt":"2024-08-08T19:41:34.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":"a822cb2e22b7ac0c6aec4d477f93301ccf65a296","after":"8d9be878b1a59aba401fd0d7b1b24c34526f0e81","ref":"refs/heads/master","pushedAt":"2024-08-08T19:41:30.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Flatten cblas.h immediately after blis.h. 
(#819)\n\nDetails:\r\n- Previously, if the user enabled CBLAS via 'configure --enable-cblas'\r\n and then ran 'make', the flattened blis.h header file would be created\r\n immediately, but the flattened cblas.h header file would not be\r\n created until 'make install' was run. This was happening because\r\n nothing in the BLIS build process (except installation) depended on\r\n the flattened cblas.h (whereas *everything* depends on the flattened\r\n blis.h, and therefore it was being created first). This behavior can\r\n be confusing to application developers who could reasonably expect\r\n that the flattened cblas.h header would be available (to inspect or\r\n use) prior to running 'make install'.\r\n- This commit fixes the aforementioned issue by (1) adding cblas.h (if\r\n CBLAS is enabled) as a dependency to all of the build rules for core\r\n framework object files, and (2) making the flattened blis.h a\r\n prerequisite for flattening cblas.h. The upshot is that (1) ensures\r\n that the flattened cblas.h is created around the the same time that\r\n the flattened blis.h is created, and (2) ensures that the two headers\r\n are flattened sequentially (first blis.h and then cblas.h) even when\r\n using 'make -j[n]', which ensures that the output of the two processes\r\n do not comingle.\r\n- Thanks to Jeff Diamond for reporting this issue.","shortMessageHtmlLink":"Flatten cblas.h immediately after blis.h. (#819)"}},{"before":"b66b48fb014f6ca34731435ff0555a647bb1d126","after":null,"ref":"refs/heads/sup_rd_s1x16n_fix","pushedAt":"2024-08-08T18:34:41.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. 
Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":"8820f8f91efd32e38e2995e73323656ef767bbd8","after":"a822cb2e22b7ac0c6aec4d477f93301ccf65a296","ref":"refs/heads/master","pushedAt":"2024-08-08T18:34:37.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Fixed out-of-bounds read bug in sup haswell ukr. (#824)\n\nDetails:\r\n- Fixed a bug in the bli_sgemmsup_rd_haswell_asm_1x16n() millikernel.\r\n The kernel was erroneously performing an out-of-bounds read whenever\r\n the singleton edge case loop executed (that is, whenever the k\r\n dimension of the millikernel problem was not a multiple of 8). This\r\n OOB error was the result of a copy-paste bug; when developing the\r\n s1x16n function, I started from a copy of the s2x16n function, but\r\n then failed to delete the instruction that reads the second element\r\n of A in the code that handles the PR loop's edge case. Thanks to\r\n @j-bm for reporting this bug in Issue #821 and helping narrow down\r\n the cause to the rax register.\r\n- CREDITS file update.","shortMessageHtmlLink":"Fixed out-of-bounds read bug in sup haswell ukr. (#824)"}},{"before":null,"after":"b66b48fb014f6ca34731435ff0555a647bb1d126","ref":"refs/heads/sup_rd_s1x16n_fix","pushedAt":"2024-08-08T00:36:08.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. 
Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Fixed out-of-bounds read bug in sup haswell ukr.\n\nDetails:\n- Fixed a bug in the bli_sgemmsup_rd_haswell_asm_1x16n() millikernel.\n The kernel was erroneously performing an out-of-bounds read whenever\n the singleton edge case loop executed (that is, whenever the k\n dimension of the millikernel problem was not a multiple of 8). This\n OOB error was the result of a copy-paste bug; when developing the\n s1x16n function, I started from a copy of the s2x16n function, but\n then failed to delete the instruction that reads the second element\n of A in the code that handles the PR loop's edge case. Thanks to\n @j-bm for reporting this bug in Issue #821 and helping narrow down\n the cause to the rax register.","shortMessageHtmlLink":"Fixed out-of-bounds read bug in sup haswell ukr."}},{"before":"3899daecdc3172a45441d1dc7dcca2681a409693","after":"857d8628cc3008cb1b0a2e885b75cb3a3111de4a","ref":"refs/heads/omit_symbols_option","pushedAt":"2024-08-03T20:39:22.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Fixed typos; leave lsame/xerbla prototypes enabled.\n\nDetails:\n- Leave lsame_() and xerbla_() prototypes enabled even when their\n respective symbols are omitted from the library.\n- Fixed copy-and-paste bug.\n- Fixed typos in the #define directives.","shortMessageHtmlLink":"Fixed typos; leave lsame/xerbla prototypes enabled."}},{"before":null,"after":"3899daecdc3172a45441d1dc7dcca2681a409693","ref":"refs/heads/omit_symbols_option","pushedAt":"2024-08-01T22:43:28.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. 
Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Implemented --omit-symbols=LIST configure option.\n\nDetails:\n- Added a new option to 'configure' that allows the user to specify a\n list of symbols to omit from the library. The format of the option is\n --omit-symbols=LIST where LIST is a comma-separated list of symbol\n names (excluding any trailing underscore). This list is parsed into\n a list of #define directives that causes the relevant parts of BLIS\n to be ignored (or not). As such, the nature of this option is to only\n support omitting symbols which have been pre-identified as potential\n troublemakers when linking BLIS with other libraries such as LAPACK\n or ScaLAPACK. (This list may grow in the future as additional symbols\n are identified.)\n- Re-implemented the --enable-scalapack-compat configure option to\n utilize the underlying --omit-symbols=LIST infrastructure.\n- Implemented an --enable-lapack-compat option, which omits all of the\n known problematic symbols currently supported for omission.\n- This commit addresses Issue #816. 
Thanks to Timo Betcke for bringing\n it to our attention and to Devin Matthews for his advice and for\n his initial implementation of --enable-scalapack-compat (PR #813).\n- CREDITS file update.","shortMessageHtmlLink":"Implemented --omit-symbols=LIST configure option."}},{"before":null,"after":"60c048228b5bfd13a99759bd213f23cb279cf9cd","ref":"refs/heads/plugin-doc","pushedAt":"2024-07-17T19:57:32.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"WIP on plugin documentation.","shortMessageHtmlLink":"WIP on plugin documentation."}},{"before":null,"after":"83f096ca9a86fa8b0727332113549a4f4595ffff","ref":"refs/heads/dont_defer_flat_cblas","pushedAt":"2024-07-10T21:55:07.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Flatten cblas.h immediately after blis.h.\n\nDetails:\n- Previously, if the user enabled CBLAS via 'configure --enable-cblas'\n and then ran 'make', the flattened blis.h header file would be created\n immediately, but the flattened cblas.h header file would not be\n created until 'make install' was run. This was happening because\n nothing in the BLIS build process (except installation) depended on\n the flattened cblas.h (whereas *everything* depends on the flattened\n blis.h, and therefore it was being created first). 
This behavior can\n be confusing to application developers who could reasonably expect\n that the flattened cblas.h header would be available (to inspect or\n use) prior to running 'make install'.\n- This commit fixes the aforementioned issue by (1) adding cblas.h (if\n CBLAS is enabled) as a dependency to all of the build rules for core\n framework object files, and (2) making the flattened blis.h a\n prerequisite for flattening cblas.h. The upshot is that (1) ensures\n that the flattened cblas.h is created around the the same time that\n the flattened blis.h is created, and (2) ensures that the two headers\n are flattened sequentially (first blis.h and then cblas.h) even when\n using 'make -j[n]', which ensures that the output of the two processes\n do not comingle.\n- Thanks to Jeff Diamond for reporting this issue.","shortMessageHtmlLink":"Flatten cblas.h immediately after blis.h."}},{"before":null,"after":"537eb30cb263cf983f7bd2897d51978a318f530c","ref":"refs/heads/fix-piledriver-again","pushedAt":"2024-07-09T18:41:43.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Run full \"make check\" for SDE tests.","shortMessageHtmlLink":"Run full \"make check\" for SDE tests."}},{"before":"ed57f0fc2a48543146205d6d5503822a95140f0b","after":"acb2896d5599328b1f24abe4c497b362d86e8f23","ref":"refs/heads/stable","pushedAt":"2024-07-05T22:00:15.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Added configure option to disable level4 ops.\n\nDetails:\n- Implemented a new configure option, --disable-level4 (with\n --enable-level4 being the current default), which disables the\n compilation of all code within frame/4. 
On underpowered CPUs this\n can cut the compilation time by 25%.\n- Note that when configuring with --disable-level4, BLIS will define\n static inline function equivalents for some level-4 object APIs.\n Without this accommodation, the testsuite would be left with\n unresolved symbols.","shortMessageHtmlLink":"Added configure option to disable level4 ops."}},{"before":"1a6772feb1749faa2b42d30ae720738087ef6967","after":"e6f7d80c700a253e7c52a74425eb3bef00bcb3fb","ref":"refs/heads/r1.x","pushedAt":"2024-06-26T21:19:10.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Fix a bug in the piledriver microkernels. (#814)\n\nDetails:\n- At some point, the piledriver (and bulldozer and excavator)\n microkernel tests via SDE had been removed from Travis CI testing.\n This PR re-enables them.\n- A bug in the piledriver complex gemm microkernels has also been\n fixed. The beta*C product was not being correctly added to the A*B\n product before writing back out to memory.\n- Fixes #811.\n- (cherry picked from commit 31ecf820b9eb3368ad907ae6b192bf7397ebc92c)","shortMessageHtmlLink":"Fix a bug in the piledriver microkernels. (#814)"}},{"before":"b05fdc3919c617a050c5ca2af102a4a99dbd4133","after":null,"ref":"refs/heads/scalapack_fix_tweaks","pushedAt":"2024-06-26T03:56:27.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":"31ecf820b9eb3368ad907ae6b192bf7397ebc92c","after":"8820f8f91efd32e38e2995e73323656ef767bbd8","ref":"refs/heads/master","pushedAt":"2024-06-26T03:56:23.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"fgvanzee","name":"Field G. 
Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Fixed typo in 4158930; variable renames. (#815)\n\nDetails:\r\n- Fixed a typo in the \"./configure --help\" output for the ScaLAPACK\r\n compatibility option implemented in 4158930.\r\n- Trivial variable renames.","shortMessageHtmlLink":"Fixed typo in 4158930; variable renames. (#815)"}},{"before":null,"after":"b05fdc3919c617a050c5ca2af102a4a99dbd4133","ref":"refs/heads/scalapack_fix_tweaks","pushedAt":"2024-06-23T23:16:16.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Fixed typo in 4158930; variable renames.\n\nDetails:\n- Fixed a typo in the \"./configure --help\" output for the ScaLAPACK\n compatibility option implemented in 4158930.\n- Trivial variable renames.","shortMessageHtmlLink":"Fixed typo in 4158930; variable renames."}},{"before":"ef879895fc30e576ae39db947b38127e94a373ce","after":null,"ref":"refs/heads/fix-piledriver","pushedAt":"2024-06-20T23:23:28.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"}},{"before":"415893066e966159799d96166cadcf9bb5535b1c","after":"31ecf820b9eb3368ad907ae6b192bf7397ebc92c","ref":"refs/heads/master","pushedAt":"2024-06-20T23:23:24.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Fix a bug in the piledriver microkernels. (#814)\n\nDetails:\r\n- At some point, the piledriver (and bulldozer and excavator) \r\n microkernel tests via SDE had been removed from Travis CI testing. 
\r\n This PR re-enables them.\r\n- A bug in the piledriver complex gemm microkernels has also been \r\n fixed. The `beta*C` product was not being correctly added to the `A*B` \r\n product before writing back out to memory.\r\n- Fixes #811.","shortMessageHtmlLink":"Fix a bug in the piledriver microkernels. (#814)"}},{"before":"06eb693c78c80ea6c184bebd03cb6ff7f5ee56c0","after":"ef879895fc30e576ae39db947b38127e94a373ce","ref":"refs/heads/fix-piledriver","pushedAt":"2024-06-20T01:40:36.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Fix bug in piledriver `[cz]gemm` microkernels.","shortMessageHtmlLink":"Fix bug in piledriver [cz]gemm microkernels."}},{"before":"11a4ef51d5b777fae498c3fe256b32a34ed9e793","after":"06eb693c78c80ea6c184bebd03cb6ff7f5ee56c0","ref":"refs/heads/fix-piledriver","pushedAt":"2024-06-20T01:25:55.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Re-enable bulldozer/piledriver/excavator in SDE tests.","shortMessageHtmlLink":"Re-enable bulldozer/piledriver/excavator in SDE tests."}},{"before":null,"after":"11a4ef51d5b777fae498c3fe256b32a34ed9e793","ref":"refs/heads/fix-piledriver","pushedAt":"2024-06-20T01:21:01.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Re-enable bulldozer/piledriver/excavator in SDE tests.","shortMessageHtmlLink":"Re-enable bulldozer/piledriver/excavator in SDE 
tests."}},{"before":"a8f03c095839d56e0108afe08f250dcf63d67180","after":null,"ref":"refs/heads/scalapack-compat","pushedAt":"2024-06-19T03:03:38.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"}},{"before":"5cbec6503de335b3b63fa5d4f388fddd3aff2b61","after":"415893066e966159799d96166cadcf9bb5535b1c","ref":"refs/heads/master","pushedAt":"2024-06-19T03:03:32.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Add ScaLAPACK compatibility mode. (#813)\n\nDetails:\r\n- Add configure options `--enable-scalapack-compat` and `--disabled-scalapack-compat`\r\n (default disabled).\r\n- Add a macro `BLIS_{ENABLE,DISABLE}_SCALAPACK_COMPAT` to bli_config.h.\r\n- This option and macro control any changes to the API necessary to maintain\r\n compatibility with ScaLAPACK. Currently, this only means disabling the complex\r\n versions of `syr`, `syr2`, and `symv`. In the future, other changes could be\r\n controlled by the same flag.\r\n- Complex `syr2` wasn't enabled at the same time that complex `syr` and `symv` were.\r\n This is now corrected.","shortMessageHtmlLink":"Add ScaLAPACK compatibility mode. 
(#813)"}},{"before":null,"after":"a8f03c095839d56e0108afe08f250dcf63d67180","ref":"refs/heads/scalapack-compat","pushedAt":"2024-06-18T23:43:49.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"devinamatthews","name":"Devin Matthews","path":"/devinamatthews","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5246113?s=80&v=4"},"commit":{"message":"Add ScaLAPACK compatibility mode.\n\nDetails:\n- Add configure options `--enable-scalapack-compat` and `--disabled-scalapack-compat`\n (default disabled).\n- Add a macro `BLIS_{ENABLE,DISABLE}_SCALAPACK_COMPAT` to bli_config.h.\n- This option and macro control any changes to the API necessary to maintain\n compatibility with ScaLAPACK. Currently, this only means disabling the complex\n versions of `syr`, `syr2`, and `symv`. In the future, other changes could be\n controlled by the same flag.\n- Complex `syr2` wasn't enabled at the same time that complex `syr` and `symv` were.\n This is now corrected.","shortMessageHtmlLink":"Add ScaLAPACK compatibility mode."}},{"before":"9730d6cc0a667723d112371f3f3d6b00af216490","after":null,"ref":"refs/heads/stable-mar24-cand0","pushedAt":"2024-06-11T18:24:28.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":"2307a4be4555ff1192f908e047402a09092371ba","after":null,"ref":"refs/heads/stable-feb19-cand0","pushedAt":"2024-06-11T18:24:28.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":"f7ce54a252028483e4c6af619015eb22063d5541","after":null,"ref":"refs/heads/1.0-rc0","pushedAt":"2024-06-11T18:19:46.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. 
Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":null,"after":"f7ce54a252028483e4c6af619015eb22063d5541","ref":"refs/heads/r1.0-rc0","pushedAt":"2024-06-11T18:19:41.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"CREDITS file update.","shortMessageHtmlLink":"CREDITS file update."}},{"before":"968c9be404763b48e72f218598c7edd2bd571780","after":null,"ref":"refs/heads/1.0-rc1","pushedAt":"2024-06-11T18:19:27.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"}},{"before":null,"after":"968c9be404763b48e72f218598c7edd2bd571780","ref":"refs/heads/r1.0-rc1","pushedAt":"2024-06-11T18:19:23.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"fgvanzee","name":"Field G. Van Zee","path":"/fgvanzee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/5487570?s=80&v=4"},"commit":{"message":"Include bli_config.h before bli_system.h in cblas.h. (#789)\n\nDetails:\n- Previously, in cblas.h, bli_config.h was being #included *after*\n bli_system.h, which meant that the BLIS_ENABLE_SYSTEM macro was\n never defined in time for proper OS detection. This bug only\n affected cblas.h -- blis.h had been correctly #including\n bli_config.h before bli_system.h since fb93d24. Thanks to\n Edward Smyth for reporting this bug and suggesting the fix.\n- (cherry picked from commit a72e4569f2a03cc3578c019bf7ce25491a44137d)","shortMessageHtmlLink":"Include bli_config.h before bli_system.h in cblas.h. 
(#789)"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEl4fVagA","startCursor":null,"endCursor":null}},"title":"Activity · flame/blis"}