Adi Fuchs

Herzliya, Tel Aviv District, Israel
2K followers, 500+ connections

About

I am a lead core architect at Speedata, a startup developing cutting-edge architecture…

Experience

  • Speedata.io

    Israel

  • Palo Alto, California, United States

  • New York, New York

  • Haifa, Israel

  • Haifa

  • Yokne'am

  • Haifa

  • Haifa

Education

  • Princeton University
  • Activities and societies: Specializing in machine learning techniques for computer architectures and operating systems.

    Graduated with highest honors (Summa Cum Laude)

  • Graduated with honors (Cum Laude)

Patents

  • Lossless Tiling in Convolution Networks—Padding Before Tiling, Location-Based Tiling, and Zeroing-Out

    Issued 11263170

    Disclosed is a data processing system to receive a processing graph of an application. A compile time logic is configured to modify the processing graph and generate a modified processing graph. The modified processing graph is configured to apply a post-padding tiling after applying a cumulative input padding that confines padding to an input. The cumulative input padding pads the input into a padded input. The post-padding tiling tiles the padded input into a set of pre-padded input tiles with a same tile size, tiles intermediate representation of the input into a set of intermediate tiles with a same tile size, and tiles output representation of the input into a set of non-overlapping output tiles with a same tile size. Runtime logic is configured with the compile time logic to execute the modified processing graph to execute the application.

  • Lossless Tiling in Convolution Networks—Read-Modify-Write in Backward Pass

    Issued 11250061

    The technology disclosed relates to enhanced tiling within a neural network, which can be implemented using processors like Central Processing Units (CPUs), Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs), Coarse-Grained Reconfigurable Architectures (CGRAs), Application-Specific Integrated Circuits (ASICs), Application Specific Instruction-set Processor (ASIP), and Digital Signal Processors (DSPs). In particular, the technology disclosed relates to using tiling to process relatively large input sizes.

  • Lossless Tiling in Convolution Networks—Weight Gradient Calculation

    Issued 11232360

    Disclosed is a data processing system that includes compile time logic configured to process a processing graph to generate a modified processing graph, which includes a plurality of forward processing nodes of a forward pass and a plurality of backward processing nodes of a backward pass. The data processing system also includes runtime logic configured with the compile time logic to execute the modified processing graph to generate, at a backward processing node of the plurality of backward processing nodes, a plurality of partial weight gradients, based on processing a corresponding plurality of gradient tiles of a gradient tensor, and generate, based on the plurality of partial weight gradients, a final weight gradient corresponding to the gradient tensor.

  • Lossless Tiling in Convolution Networks—Section Boundaries

    Issued US 11227207

    Disclosed is a data processing system that includes compile time logic to section a graph into a sequence of sections, configure a first section to generate a first set of output tiles in a first target tiling configuration in response to processing a first set of input tiles in a first input tiling configuration, and configure a second section to generate a second set of output tiles in a second target tiling configuration in response to processing the first set of output tiles in a second input tiling configuration. Runtime logic is configured to pad a first input into a first padded input, read the first set of input tiles from the first padded input in the first input tiling configuration, and process the first set of input tiles through the first section to generate the first set of output tiles in the first target tiling configuration.

  • Lossless tiling in convolution networks—tiling configuration

    Issued 11195080

    Disclosed is a data processing system that includes compile time logic configured to section a graph into a sequence of sections, and configure each section of the sequence of sections such that an input layer of a section processes an input, one or more intermediate layers of the corresponding section processes corresponding one or more intermediate outputs, and a final layer of the corresponding section generates a final output. The final output has a non-overlapping final tiling configuration, the one or more intermediate outputs have corresponding one or more overlapping intermediate tiling configurations, and the input has an overlapping input tiling configuration. The compile time logic is further to determine the various tiling configurations by starting from the final layer and reverse traversing through the one or more intermediate layers, and ending with the input layer.

  • Host channel adapter with pattern-type DMA

    Issued US20130166793 A1

    An input/output (I/O) device includes a memory buffer and off-loading hardware. The off-loading hardware is configured to accept from a host a scatter/gather list including one or more entries. The entries include at least a pattern-type entry that specifies a period of a periodic pattern of addresses that are to be accessed in a memory of the host. The off-loading hardware is configured to transfer data between the memory buffer of the I/O device and the memory of the host by accessing the addresses in the memory of the host in accordance with the periodic pattern at intervals indicated in the period.

    Other inventors
    • Ariel Shahar
    • Noam Bloch

Honors and awards

  • 2019 Top picks from the computer architecture conferences (honorable mention)

    IEEE

    The HPCA 2019 paper on the accelerator wall was selected as an honorable mention among the top computer architecture papers, which are recognized as likely to influence the work of computer architects for years to come.

  • Communications of the ACM Research highlights

    ACM

    Research highlights for the ASPLOS 2016 OpenPiton paper

  • HPCA 2019 Best paper runner up

    IEEE

    Best paper runner up awarded to the Accelerator Wall paper

  • ASPLOS 2016 Best paper runner up

    ACM

    Best paper runner up awarded to the OpenPiton paper

  • 2014 HiPEAC paper award

    European Network of Excellence on High Performance and Embedded Architecture and Compilation

    Paper awarded: "Loop-Aware Memory Prefetching Using Code Block Working Sets", Fuchs, A., Mannor, S., Weiser, U., Etsion, Y.

  • 2013 Intel Award

    Intel Corporation

    2013 Intel award recipient for excellence in research, for the study titled "Task Differentials: Dynamic, inter-thread predictions using memory access footsteps".

  • Technion EE Dept. Dean's list

    Technion EE Dept.

  • Excellent Project Award

    Technion EE Dept. Computer Graphics Lab

    Hand gesture classification using depth images - tested to classify American Sign Language gestures (possible real-time applications presented)
    http://cgm.technion.ac.il/Computer-Graphics-Multimedia/Undergraduate-Projects/2009/SignLanguage/ProjectWeb/demo.html

  • Technion President's List

    Technion - IIT
