  1. Feb 04, 2018
      enhance pragma to support single point copy (#863) · fbb472b8
      libing4752 authored
      * modified schedule_dataflow_rewrite.cc to fix losing tensor problem
      
      * modified schedule_dataflow_rewrite.cc for lint scan
      
      * modified schedule_dataflow_rewrite.cc for lint scan
      
      * using tensor's value_index to index output of stage op
      
      * repair address offset for different kinds of dtype
      
      * bc
      
      * aaa
      
      * aaaaa
      
      * repair address for different dtypes
      
      * remove nonsense files
      
      * add whitespace of line 581
      
      * use base alloc elem_type
      
      * enhance the test case where the basic buffer is 64 bits, 32 bits, 16 bits, or 8 bits
      
      * use extends[0]->type() as dtype of offset
      
      * clear program writes
      
      * enhance inject_copy_intrin to support pragma stmt with no loops
      
      * fix cpplint errors
      
      * fix cpplint error of !
      
      * enhance detectLinearEquation to support the case with no loop vars
      
      * fix cpplint errors
      fbb472b8
  2. Feb 03, 2018
  3. Feb 02, 2018
  4. Jan 31, 2018
  5. Jan 30, 2018
  6. Jan 29, 2018
  7. Jan 28, 2018
      Porting schedules (except convolutions) to C++ (#763) · f280f23a
      alex-weaver authored
      * Ported injective schedules to C++. Added some elementwise ops.
      
      * Fix lint errors
      
      * Added reduction ops and schedules
      
      * Fix lint errors
      
      * Fix lint errors
      
      * Fix lint errors
      
      * Added transform ops
      
      * Fix lint errors
      
      * Fix lint errors
      
      * Added softmax, log_softmax, leaky_relu and flatten ops.
      Fixed issue where TVM_DECLARE_INTRIN_UNARY used the PureExtern flag
      instead of PureIntrinsic.
      Added softmax CUDA schedule.
      
      * Fix lint
      
      * Fix lint
      
      * Added binary_dense, batch_norm_inference, dense, dilate, scale_shift_*,
      global_pool and pool ops.
      Extended pad to allow specifying pad_value.
      Fixed issue where pad would throw if padding was zero in all dimensions.
      
      * Fix lint
      
      * Fix lint
      
      * Added CUDA schedules for dense, pool and global_pool
      
      * Added extern schedules for generic and CUDA
      
      * Fix lint
      
      * Added x86 binary schedules
      
      * Fix lint
      
      * Added rocm dense schedule. Added rocBLAS and cuBLAS support to dense ops
      
      * Added pow ops. Added x86 default and injective schedules
      
      * Fix lint
      
      * Fix lint
      
      * Fix lint
      
      * Fix lint
      
      * Fix lint
      
      * Fix indent
      
      * Removed schedules directory
      
      * Changed left_shift, right_shift to operators. Changed pad_value in pad() to remove pointer usage
      
      * Fixed usage of pad in nn/pooling.h. Fixed declaration of operator>>
      
      * Fixed comments for shift operators
      
      * Added comments to utility functions
      
      * Added TOPI C++ library, exporting broadcast_add op
      
      * Fix lint
      
      * Share libinfo.py with TVM
      
      * Fix lint
      
      * Add other broadcast ops
      
      * Fix lint
      
      * Fix imports in topi
      
      * Fix lib names
      
      * Fixed build issue where windows builds don't apply correct definitions
      
      * Removed TVM_EXPORTS from topi library
      
      * Attempted CI build fix
      
      * Add topi lib to tvm_multilib
      
      * Fix Jenkinsfile
      
      * Added TOPI build target to Makefile
      
      * Fix nn op namespaces.
      
      * Fix lint
      
      * Renamed TOPI lib to libtvm_topi
      
      * Removed _ffi/base.py
      
      * Remove _ffi from topi, now shared with tvm.
      
      * Make libtvm_topi loading optional
      
      * Fix compiler warnings
      
      * Fix lint
      
      * Fix lint
      
      * Fix lint
      
      * Fix build error by making new libs argument to Target optional
      
      * Added C++ Target type interop. Added registration of remaining C++ ops and schedules. Added test of broadcast ops
      
      * Fix lint
      
      * Fix lint
      
      * Fix compile error
      
      * Fix compiler warnings
      
      * Fix compiler warnings
      
      * Fixed int vector interop. Fixed argmin incorrectly invoking argmax. Fixed corner case in default schedules of attempting to fuse 0 length axes. Added tests for reduce ops.
      
      * Refactored reduce builders
      
      * Fixed typos in topi.cc. Added basic test.
      
      * Fixed padding size error. Added dense, dilate, pooling tests
      
      * Fixed issue where clip would output a different dtype to the input. Added split_sections op to cover the other mode of the python split op. Added tests.
      
      * Changed extension type numbers to avoid clash with NNVM
      
      * Fix lint
      
      * Fix compiler warnings
      
      * Removed use of std::vector from the public TOPI API
      
      * Fix lint
      
      * Add TOPI C++ tests to CI
      
      * Fixed detail namespacing. Improved comments.
      f280f23a
    • Zhixun Tan's avatar
      944de73b
    • Siva's avatar
  8. Jan 27, 2018
    • Tianqi Chen's avatar
      support using pointer with an original offset (#826) · 293dac39
      kun-zh authored
      * when there is no intrin func, using body for initialization. For issue 714.
      
      * Refine code per review comments, and add a test case.
      
      * Fix lint issues.
      
      * Re-organize the tensorize test cases, and add a new case for non-reset
      mode.
      
      * Fix a typo.
      
      * Delete the unit case because it was already merged into test_schedule_tensorize.py.
      
      * always use new tensor in its stage when rewrite for cache read
      
      * revert previous changes to sync up with master
      
      * support using the ptr with an original offset
      
      * update test case and fix CI error
      293dac39
  9. Jan 25, 2018
  10. Jan 24, 2018
    • Tianqi Chen's avatar
      37734045
      [PASS] enhance storage_rewrite to support different dtypes for unified buffer (#805) · 5fc4bc57
      libing4752 authored
      * modified schedule_dataflow_rewrite.cc to fix losing tensor problem
      
      * modified schedule_dataflow_rewrite.cc for lint scan
      
      * modified schedule_dataflow_rewrite.cc for lint scan
      
      * using tensor's value_index to index output of stage op
      
      * repair address offset for different kinds of dtype
      
      * bc
      
      * aaa
      
      * aaaaa
      
      * repair address for different dtypes
      
      * remove nonsense files
      
      * add whitespace of line 581
      
      * use base alloc elem_type
      
      * enhance the test case where the basic buffer is 64 bits, 32 bits, 16 bits, or 8 bits
      
      * use extends[0]->type() as dtype of offset
      
      * clear program writes
      5fc4bc57
  11. Jan 23, 2018
  12. Jan 22, 2018
      Update inject_virtual_thread.cc (#806) · f386bf5c
      Siju Samuel authored
      Fixes the following compilation warning:
      src/pass/inject_virtual_thread.cc:43:19: warning: ‘rw_mask’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             if (rw_mask & 2) {
                 ~~~~~~~~^~~
      f386bf5c
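The commit message above does not say how the warning was silenced. A common remedy for `-Wmaybe-uninitialized` is to give the variable a defined value at its declaration. The following is a minimal sketch under that assumption, not the actual code from `inject_virtual_thread.cc`. The helpers `AccessMask` and `IsWritten` and the string-based access kinds are hypothetical; only the name `rw_mask` and the `rw_mask & 2` test come from the warning text.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the TVM source): maps an access kind to a
// read/write bitmask, assuming bit 0 means "read" and bit 1 means "write".
int AccessMask(const std::string& kind) {
  // Initializing at the declaration guarantees a defined value on every
  // path, including the fall-through case where no branch matches.
  int rw_mask = 0;
  if (kind == "read") {
    rw_mask = 1;
  } else if (kind == "write") {
    rw_mask = 2;
  } else if (kind == "read_write") {
    rw_mask = 3;
  }
  return rw_mask;
}

// Mirrors the `rw_mask & 2` check quoted in the warning.
bool IsWritten(const std::string& kind) {
  return (AccessMask(kind) & 2) != 0;
}
```

Without the initializer, an unrecognized `kind` would leave `rw_mask` unset, which is the situation the compiler warns about.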
  13. Jan 20, 2018
  14. Jan 19, 2018
  15. Jan 16, 2018
  16. Jan 12, 2018
  17. Jan 11, 2018
  18. Jan 10, 2018
  19. Jan 09, 2018
  20. Jan 08, 2018
  21. Jan 07, 2018
      [SCHEDULE]Improve bound deduce for loop partition (#743) (#755) · 9d6dbe34
      xqdan authored
      * [SCHEDULE]enable partition const loop with build flag (#719)
      
          * enable partition loop with build flag
      
          * add a testcase, and modify LoopPartition related cases
      
          * add document for split_const_loop
      
      * [IRbuild]Support automatically naming loop variables in IRBuilder (#719)
      
          * add idx_num in class
      
      * using typical index [i, j, k] first, then i_suffix
      
      * keep inputs names
      
      * fix lint
      
      * improve comment of name
      
      * fix lint
      
      * [SCHEDULE]Improve bound deduce for loop partition (#743)
      
          * add divided checking when deducing
      
          * related testcase
      
      * fix
      
      * transform LE and GE first
      * remove is_equal
      * modify testcase for edge cases checking
      
      * fix comment

      * fix lint

      * apply transformation from LT -> LE, GT -> GE

      * fix lint
      
      * simplify code and testcase
      
      * add negative coefficient case
      
      * More complicated cases
      
      * add testcase
      
      * simplify testcase
      
      * comment case for now
      
      * fix testcase
      9d6dbe34
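The bullets above mention rewriting LT as LE and GT as GE, a divisibility check, and a negative-coefficient case. As a rough illustration of why those steps matter when deducing an integer bound, here is a small sketch. It is not the TVM bound-deduce implementation; the names `FloorDiv`, `LtToLe`, `GtToGe`, `Bound`, and `SolveLe` are hypothetical and assume all values are integers with a nonzero coefficient.

```cpp
#include <cassert>
#include <cstdint>

// Floor division for signed integers (C++ '/' truncates toward zero).
int64_t FloorDiv(int64_t a, int64_t b) {
  int64_t q = a / b;
  if ((a % b != 0) && ((a < 0) != (b < 0))) --q;
  return q;
}

// Over the integers, a strict bound can be rewritten as a non-strict one:
//   e < b  is equivalent to  e <= b - 1
//   e > b  is equivalent to  e >= b + 1
int64_t LtToLe(int64_t b) { return b - 1; }
int64_t GtToGe(int64_t b) { return b + 1; }

// Result of solving an inequality for x: an upper bound (x <= value)
// or a lower bound (x >= value).
struct Bound {
  bool is_upper;
  int64_t value;
};

// Solve coef * x <= rhs for integer x, assuming coef != 0.
Bound SolveLe(int64_t coef, int64_t rhs) {
  if (coef > 0) {
    // When rhs is not divisible by coef, rounding down keeps the bound valid.
    return {true, FloorDiv(rhs, coef)};
  }
  // A negative coefficient flips the inequality: x >= ceil(rhs / coef),
  // and ceil(rhs / coef) equals -floor(rhs / -coef).
  return {false, -FloorDiv(rhs, -coef)};
}
```

For example, `3 * x < 10` becomes `3 * x <= 9` via `LtToLe`, and `SolveLe(3, 9)` then gives the upper bound `x <= 3`.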
  22. Jan 04, 2018
  23. Jan 03, 2018
  24. Jan 02, 2018
      [CONTRIB] cuBLAS integration (#744) · 3d5032ae
      masahi authored
      * add cublas support
      
      * integrate cublas to topi dense
      
      * add cublas error check
      
      * minor fix
      
      * fix lint
      
      * remove topi import from contrib unittest
      3d5032ae
  25. Dec 29, 2017
  26. Dec 27, 2017
  27. Dec 26, 2017
      [TOPI] add extern schedule for cudnn and miopen (#724) · cdb2f873
      masahi authored
      * add extern schedule for miopen
      
      * fix comment
      
      * optionally dispatch to miopen from topi
      
      * fix lint
      
      * check if current target is None
      
      * use generic dispatch for rocm conv2d
      
      * fix lint
      
      * fix workspace bug
      
      * remove blank line
      
      * remove blank line
      
      * remove blank line
      cdb2f873