- Jan 03, 2018
-
-
libing4752 authored
* modified schedule_dataflow_rewrite.cc to fix losing tensor problem
* modified schedule_dataflow_rewrite.cc for lint scan
* modified schedule_dataflow_rewrite.cc for lint scan
* using tensor's value_index to index output of stage op
-
Lianmin Zheng authored
* [CODEGEN] update codegen for vector operation
* update comment, fix for metal
* fix some bugs in codegen
* use 'restrict' in every argument
* fix
* fix
-
- Jan 02, 2018
-
-
masahi authored
* add cublas support
* integrate cublas to topi dense
* add cublas error check
* minor fix
* fix lint
* remove topi import from contrib unittest
-
- Dec 31, 2017
-
-
xqdan authored
* [SCHEDULE] enable partition const loop with build flag (#719)
* enable partition loop with build flag
* add a testcase, and modify LoopPartition related cases
* add document for split_const_loop
* [IRbuild] Support automatically Name Loop Variable in IRBuilder (#719)
* add idx_num in class
* using typical index [i, j, k] first, then i_suffix
* keep inputs names
* fix lint
* improve comment of name
* fix lint
-
Tianqi Chen authored
-
- Dec 29, 2017
-
-
xqdan authored
* [SCHEDULE] enable partition const loop with build flag (#719)
* enable partition loop with build flag
* add a testcase, and modify LoopPartition related cases
* add document for split_const_loop
-
masahi authored
* use cudnn findalgo to choose the best algo
* fix lint
-
kun-zh authored
* When there is no intrin func, use the body for initialization. For issue 714.
* Refine code per review comments, and add a test case.
* Fix lint issues.
* Re-organize the tensorize test cases, and add a new case for non-reset mode.
* Fix a typo.
* Delete the unit case because it was already merged into test_schedule_tensorize.py.
-
- Dec 27, 2017
-
-
kun-zh authored
* When there is no intrin func, use the body for initialization. For issue 714.
* Refine code per review comments, and add a test case.
* Fix lint issues.
-
Xingjian Shi authored
* support dim-0 tensor in topi ops revert transform
* revert
-
masahi authored
* add target.libs to target str representation
* integrate cudnn into topi cuda
* append target.libs to target.options
-
- Dec 26, 2017
-
-
masahi authored
* add extern schedule for miopen
* fix comment
* optionally dispatch to miopen from topi
* fix lint
* check if current target is None
* use generic dispatch for rocm conv2d
* fix lint
* fix workspace bug
* remove blank line
* remove blank line
* remove blank line
-
Tianqi Chen authored
-
- Dec 25, 2017
-
-
Yuwei Hu authored
* add x86_64 target
* add binary dense operator
* rebase
* improve schedule
* remove x86 target
* improve schedule
-
- Dec 24, 2017
-
-
masahi authored
* first working miopen support
* do FindFwdAlgo during build time
* fix lint
* update doc string
* import topi after checking if rocm is enabled
* add miopen namespace
* fixed descriptor overwrite bug
* add use_miopen option
* fix lint
* better miopen option handling
* fix typo
* fix options handling
-
Lianmin Zheng authored
* [CODEGEN] update codegen for vector operation
* update comment, fix for metal
-
Tianqi Chen authored
-
- Dec 23, 2017
-
-
Cody Hao Yu authored
* Make duplicated function name checker work
* Fix dependency checking problem for reducer condition (#712); add test
* Fix dependency checking problem for reducer condition (#712); add test
* Specify R to be computed inlined
-
Tianqi Chen authored
-
Salem Derisavi authored
-
- Dec 22, 2017
-
-
Salem Derisavi authored
During tensorize, call Simplify on algorithm and intrinsic definitions before CanonicalSimplify. This will prevent a number of false tensorize mismatches. (#718)
Thanks, we can use this solution for now.
-
- Dec 19, 2017
-
-
Salem Derisavi authored
* 1) Removed non-determinism from CanonicalSimplify 2) Added a couple of test cases for CanonicalSimplify
* Use IRDeepCompare instead of comparison of string representation
* Give a warning (instead of fatal error) when two "ComExprEntry"s are equal
-
- Dec 17, 2017
-
-
Andrew Adams authored
-
- Dec 16, 2017
-
-
masahi authored
-
- Dec 15, 2017
-
-
Cody Hao Yu authored
-
- Dec 13, 2017
-
-
Salem Derisavi authored
* Simplify expressions early on
* fixed lint errors
-
Salem Derisavi authored
* 1) Refactored some parts of the unrolling code into their own methods so the unrolling functionality can be reused in other parts of the code, e.g., to explicitly unroll loops with a count of 1 when they are programmatically created. 2) Reorder based on top operator before resorting to pointers, which causes non-determinism.
* Fixed lint errors
-
- Dec 11, 2017
-
-
abergeron authored
* Use long long for platforms where long is 32 bits (like windows).
* Make sure scalar chars are signed.
* Re-add NOLINT marker.
-
Lianmin Zheng authored
* [CODEGEN] add fp16 and fp64 enable pragma for opencl
* fix style
-
- Dec 07, 2017
-
-
Lianmin Zheng authored
-
- Dec 05, 2017
-
-
alex-weaver authored
* Port build_module.py to C++
* Fix lint errors
* Fix more lint errors
* Fix more lint errors
* Fix more lint errors
* Fix build error
* Implemented style fixes
* Fix lint errors
* Added function to construct target from string; lower now returns array
* Fix lint error
* Implemented review changes - style & Target options -> std::vector
* Fixed lint, argument alignment and added unit test
* Changed test to target LLVM, fixed sign compare warnings
* Reverted unit test to CUDA, changed Jenkinsfile to enable GPU for C++ tests
* Slight change to Jenkinsfile
* Changed build_module test from CUDA to LLVM
* Added function var() to construct a Var instance. Changed implementation of LLVMEnabled()
* Reverted Jenkinsfile
-
- Dec 04, 2017
-
-
Tianqi Chen authored
* [CI] Enable llvm in CPU test
* fix llvm
-
Tianqi Chen authored
* Support rank-0 tensor
* fix lint
-
- Dec 01, 2017
-
-
ziheng authored
* [RANDOM] Init contrib.random library
* [RANDOM] Add uniform
* [RANDOM] Fix lint
* [RANDOM] Add comments and tests
* [RANDOM] Fix lint
-
- Nov 30, 2017
-
-
Salem Derisavi authored
-
Tianqi Chen authored
* [CUDA] Enable int64
* [PYTHON] Fix rpc tutorial with opencl
* OK
* update
-
Yizhi Liu authored
-
solin319 authored
Change the parameter 'C' name
-
solin319 authored
In unroll_loop.cc the parameter name is "auto_max_depth", but in ir_pass.h the parameter name is "auto_min_depth".
-
- Nov 29, 2017
-
-
Tianqi Chen authored
* [RPC][JVM] Remove binary dist gradle from repo
* fix header
-