Serkan Özal

Ankara, Turkey
6K followers, 500+ connections

About

A Computer Engineering MSc on Big Data and Cloud Computing with Hadoop and Hive at METU…

Experience

  • Amazon Web Services (AWS)
  • -

  • -

  • -

    Ankara, Turkey

  • -

  • -

    Boston, Massachusetts, United States

  • -

    Boston, Massachusetts, United States

  • -

    Boston, Massachusetts, United States

  • -

  • -

    Ankara, Turkey

  • -

    Ankara, Turkey

  • -

    Ankara, Turkey

  • -

  • -

  • -

    Ankara, Turkey

  • -

  • -

Education

  • Middle East Technical University (Orta Doğu Teknik Üniversitesi)
  • -

    Thesis Subject: Multiple Query Optimization over Hadoop with Hive by Sharing Scan and Computation Tasks

  • -

    Ranking: 4th

  • -

  • -

Licenses and Certifications

Publications

  • Flying Server-less on the Cloud with AWS Lambda

    Serverless Turkey Meetup

    What is serverless, and what does AWS Lambda provide as a FaaS service?

  • MySafe

    My Unsafe - Unsafe Interceptor, Native Memory Leak Tracker and Access Checker on the JVM

    MySafe intercepts (instruments) sun.misc.Unsafe calls and keeps records of allocated memory. It can therefore report allocated-memory information, detect invalid memory accesses, and find the origins of native memory leaks.

  • JVM Under the Hood

    What is going on under the hood in JVM?

    Agenda:
    - JVM Concepts
    - Memory Management
    - GC
    - Class Loading
    - Execution Engine
    - Multi-Threading
    - JNI

  • Big Data on AWS

    Big data services on AWS:
    - Storage (S3, Glacier)
    - Analytics & Querying (DynamoDB, Redshift, RDS, Elasticsearch, CloudSearch, QuickSight)
    - Processing (EMR, Kinesis, Lambda, Machine Learning)
    - Flow (Firehose, Data Pipeline, DMS, Snowball)

    Demo: https://github.com/serkan-ozal/ankaracloudmeetup-bigdata-demo

  • Ankara JUG Big Data Presentation

    Ankara JUG

    Presentation about Big Data Concepts, Map-Reduce and Hadoop for Ankara JUG December 2014 Meeting

  • Improving the performance of Hadoop Hive by sharing scan and computation tasks

    Journal of Cloud Computing

    MapReduce is a popular programming model for executing time-consuming analytical queries as a batch of tasks on large scale data clusters. In environments where multiple queries with similar selection predicates, common tables, and join tasks arrive simultaneously, many opportunities can arise for sharing scan and/or join computation tasks. Executing common tasks only once can remarkably reduce the total execution time of a batch of queries. In this study, we propose a Multiple Query Optimization framework, SharedHive, to improve the overall performance of Hadoop Hive, an open source SQL-based data warehouse using MapReduce. SharedHive transforms a set of correlated HiveQL queries into a new set of insert queries that will produce all of the required outputs within a shorter execution time. It is experimentally shown that SharedHive achieves significant reductions in total execution times of TPC-H queries.
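
    The scan-sharing idea in the abstract can be illustrated with a toy, in-memory sketch. This is not SharedHive's implementation: the `orders` table, the two queries, and the `separate_scans`/`shared_scan` helpers are hypothetical, and plain Python loops stand in for MapReduce jobs. It only shows that several queries with different selection predicates over the same table can be answered in one pass instead of one pass per query.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def separate_scans(table: List[Row], queries: Dict[str, Predicate]) -> Tuple[Dict[str, List[Row]], int]:
    """Baseline: every query scans the whole table on its own."""
    rows_read = 0
    results: Dict[str, List[Row]] = {}
    for name, predicate in queries.items():
        matched = []
        for row in table:
            rows_read += 1
            if predicate(row):
                matched.append(row)
        results[name] = matched
    return results, rows_read


def shared_scan(table: List[Row], queries: Dict[str, Predicate]) -> Tuple[Dict[str, List[Row]], int]:
    """Shared: a single pass over the table feeds every query's output."""
    rows_read = 0
    results: Dict[str, List[Row]] = {name: [] for name in queries}
    for row in table:
        rows_read += 1
        for name, predicate in queries.items():
            if predicate(row):
                results[name].append(row)
    return results, rows_read


# Hypothetical table and queries, for illustration only.
orders = [
    {"id": i, "price": 10 * i, "status": "open" if i % 2 else "closed"}
    for i in range(1, 7)
]
queries = {
    "expensive": lambda r: r["price"] > 30,
    "open": lambda r: r["status"] == "open",
}

separate_results, separate_reads = separate_scans(orders, queries)
shared_results, shared_reads = shared_scan(orders, queries)
```

    Both functions return the same per-query results; the difference is only in how many rows are read, which grows with the number of queries in the baseline and stays at one table pass in the shared version.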

  • AWS EMR - Amazon Elastic Map Reduce

    Contents
    * What Is Amazon EMR
    * Main Components of Amazon EMR
    * Amazon EMR Features
    * Ways of Using Amazon EMR
    * Instance Types
    * Cluster Types
    * Amazon EMR Node Types
    * Amazon EMR Architecture
    * Using Amazon EMR Through the Web Interface

  • Dangerous Code: How to be Unsafe with Java Classes & Objects in Memory

    Rebellabs / ZeroTurnaround

    Let’s get laid out…the class and object structure of Java in memory:

    * How much space does a class take up in memory?
    * How much space do my objects consume in machine memory?
    * What’s the deal with the alignment of object properties in memory?

  • Big Data Concepts

    Contents:

    1. Big Data and Scalability
    2. NoSQL
    2.1. Column Stores
    2.2. Key-Value Stores
    2.3. Document Stores
    2.4. Graph Database Systems
    3. Batch Data Processing
    3.1. MapReduce
    3.2. Hadoop
    3.3. Running Analytical Queries over Offline Big Data
    3.3.1. Hive
    3.3.2. Pig
    4. Real-Time Data Processing
    4.1. Storm

  • Sharing Scan and Computation Tasks in Hadoop Hive

    -

    Our framework, Shared-Hive, transforms a batch of queries into a new batch that will be executed more efficiently, by merging jobs into groups and evaluating each group as a single query. Based on our cost model, we define an optimization problem and we provide a solution that derives the optimal grouping of queries. Experiments in our prototype, built on top of Hadoop, demonstrate the overall effectiveness of our approach and substantial savings.


Patents

  • Cache Eviction in a Distributed Computing System

    US 62/151,326

    High-performance, constant-time eviction for distributed computing systems


Projects

  • Sirocco

    - Present

    Sirocco, named after a character from Aladdin (http://aladdin.wikia.com/wiki/Sirocco), is a Java-based Lambda infrastructure developed by OpsGenie as an in-house framework. It has many unique features for the AWS Lambda platform, such as distributed/embedded monitoring (audit + stat + log), instrumentation, profiling, control requests, warmup, discovery, and error handling/retrying over DLQ.

    This repository includes the open-sourced modules of Sirocco. However, some parts are not open-sourced and are used internally by OpsGenie's AWS Lambda based serverless infrastructure and products.

  • Thundra

    - Present

    Thundra, named after a character from Aladdin (http://aladdin.wikia.com/wiki/Thundra), is a JVM-based monitoring framework with audit (including instrumentation and profiling extensions), stat, and log support.

    Thundra provides the following monitoring infrastructures:

    - Audit: Auditing is used for tracing executions (method calls), collecting metrics, and publishing the audit data to be analyzed later. An audit starts at a point in time, collects audit metrics, and then finishes. It is generally used for tracing execution during request/response based communication.

    - Stat: The stat infrastructure collects stats (cumulative or instantaneous) and publishes the stats data to be analyzed later. Stats can be application/environment specific (CPU stats, memory stats, etc.), module/layer specific (cache stats, DynamoDB stats, etc.), or domain specific (user stats, etc.).

    - Log: The logging infrastructure provides a log4j logger (org.apache.log4j.Logger) that decorates logs with application/environment information (application name, type, id, version, profile, host name, host IP, etc.) and with provided domain-specific custom log properties (customer name, user name, etc.).
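
    The log-decoration idea in the last item can be sketched with Python's standard `logging` module. This is an analog only, not Thundra's API (which is built on log4j): the `ContextFilter` class, the `build_logger` helper, and the property names `app_name`, `app_version`, and `host` are all hypothetical.

```python
import io
import logging


class ContextFilter(logging.Filter):
    """Attach fixed application/environment properties to every log record."""

    def __init__(self, **properties):
        super().__init__()
        self.properties = properties

    def filter(self, record):
        for key, value in self.properties.items():
            setattr(record, key, value)
        return True  # never drop the record, only decorate it


def build_logger(stream, **properties):
    """Build a logger whose output lines carry the given properties."""
    logger = logging.getLogger("decorated-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    logger.filters.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("[%(app_name)s %(app_version)s @ %(host)s] %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.addFilter(ContextFilter(**properties))
    return logger


# Hypothetical application properties, for illustration only.
buffer = io.StringIO()
log = build_logger(buffer, app_name="billing", app_version="1.2.0", host="node-1")
log.info("invoice created")
decorated_line = buffer.getvalue().strip()
```

    The call site logs a plain message; the application context is added centrally, so every line can later be filtered or grouped by application, version, or host.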

  • DynaCast

    In short, DynaCast caches AWS DynamoDB data with Hazelcast and keeps the two eventually consistent. It is a very simple caching library based on Hazelcast ("Cast" comes from here) on top of AWS DynamoDB ("Dyna" comes from here), with very basic caching functionality (get, put, replace, remove), usable as a distributed or tiered (local + distributed) cache.

    DynaCast caches data in memory, distributed via Hazelcast, and persists data into AWS DynamoDB. Under the hood, cache data in Hazelcast is kept eventually consistent with AWS DynamoDB by receiving mutation events (ordered by shard/partition) from AWS DynamoDB Streams.
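
    A minimal sketch of keeping a cache eventually consistent from an ordered mutation stream follows. It is not DynaCast's code and uses neither Hazelcast nor the DynamoDB Streams API: the `Mutation` record, the `StreamBackedCache` class, and the integer `sequence` field are hypothetical stand-ins, under the assumption that events for a given key arrive with increasing sequence numbers but may be redelivered.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Mutation:
    key: str
    sequence: int          # ordering within a shard (assumed to increase over time)
    value: Optional[str]   # None stands for a delete


class StreamBackedCache:
    """In-memory cache updated only by applying mutation events."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._applied: Dict[str, int] = {}  # last sequence applied per key

    def get(self, key: str) -> Optional[str]:
        return self._values.get(key)

    def apply(self, event: Mutation) -> bool:
        """Apply an event unless the same or a newer one was already applied."""
        if event.sequence <= self._applied.get(event.key, -1):
            return False  # stale or duplicate delivery, ignore it
        self._applied[event.key] = event.sequence
        if event.value is None:
            self._values.pop(event.key, None)
        else:
            self._values[event.key] = event.value
        return True


cache = StreamBackedCache()
events = [
    Mutation("user:1", 1, "alice"),
    Mutation("user:1", 2, "alice-v2"),
    Mutation("user:1", 1, "alice"),   # redelivered event, must be ignored
    Mutation("user:2", 3, "bob"),
    Mutation("user:2", 4, None),      # delete
]
applied = [cache.apply(event) for event in events]
```

    Tracking the last applied sequence per key makes event application idempotent, which is what lets the cache converge to the store's state even when deliveries repeat.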

  • Samba

    In general, Samba is a very simple caching library with very basic caching functionality (get, put, replace, remove), usable as a local, global, or tiered (local + global) cache.

    Samba is designed from scratch for non-blocking cache access with lock-free algorithms, so high performance is one of its major requirements. Keeping the promise of its strong/eventual consistency model is another.

    Even though Samba can be useful in many cases as a simple caching layer, it is aimed first at AWS's Lambda service, for sharing state/information between different Lambda function invocations, whether on the same container (process) or on another container (process/machine).
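
    The tiered (local + global) layout with the four operations can be sketched as follows. This is not Samba's implementation: the `TieredCache` class and its `replace(key, expected, new_value)` signature are hypothetical, a plain dict stands in for the global store, and the code is single-threaded, so it shows none of the lock-free machinery the description mentions.

```python
class TieredCache:
    """Local tier in front of a shared global tier (both plain dicts here)."""

    def __init__(self, global_store):
        self._local = {}
        self._global = global_store  # any dict-like shared store (assumption)

    def get(self, key):
        if key in self._local:
            return self._local[key]
        value = self._global.get(key)
        if value is not None:
            self._local[key] = value  # populate the local tier on a global hit
        return value

    def put(self, key, value):
        self._global[key] = value
        self._local[key] = value

    def replace(self, key, expected, new_value):
        """Swap the value only if the global tier still holds `expected`."""
        if self._global.get(key) != expected:
            self._local.pop(key, None)  # the local copy may be stale, drop it
            return False
        self.put(key, new_value)
        return True

    def remove(self, key):
        self._local.pop(key, None)
        return self._global.pop(key, None) is not None


# Two cache instances sharing one global store, e.g. two function containers.
shared = {}
cache_a = TieredCache(shared)
cache_b = TieredCache(shared)

cache_a.put("k", "v1")
first_read = cache_b.get("k")
b_swapped = cache_b.replace("k", "v1", "v2")
stale_read = cache_a.get("k")                  # served from A's local tier
a_swapped = cache_a.replace("k", "v1", "v3")   # fails and evicts A's stale copy
fresh_read = cache_a.get("k")
```

    The stale read on `cache_a` is the point of the example: a local tier is fast but can lag behind the global one, which is why a tiered cache has to state whether it offers strong or eventual consistency.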

  • MySafe

    MySafe is a framework (based on Jillegal-Agent) for managing memory accesses over sun.misc.Unsafe. MySafe intercepts (instruments) `sun.misc.Unsafe` calls and keeps records of allocated memory, so it can report allocated-memory information and detect invalid memory accesses.

  • HermGen

    Hazelcast Based Distributed ClassLoader and PermGen

    Demo Application: https://github.com/serkan-ozal/hermgen-demo

  • Jemstone

    Hidden gems of Java/JVM.

    Jemstone is a platform for running HotSpot Serviceability Agent API based implementations on the current application (JVM process) or on another application (another JVM process).

  • Hazelcast-Aware

    Hazelcast-Aware is a Java Instrumentation API based Hazelcast extension for using Hazelcast data structures (distributed maps, lists, sets, queues, objects, locks, topics, executors, entry listeners, etc.) without interacting with the HazelcastInstance class. You can specify which classes or fields will be Hazelcast aware through annotation or XML based configuration. Hazelcast-Aware scans the classpath of your application (classpath directories, dependent jar files, and web application directories), finds Hazelcast-aware classes and fields, and then instruments them. A demo application is available at https://github.com/serkan-ozal/hazelcast-aware-demo.

  • Spring-JDBC-ROMA 2.0

    Spring-JDBC-ROMA is a row mapper extension for the Spring-JDBC module. Spring already ships a row mapper, "org.springframework.jdbc.core.BeanPropertyRowMapper", for binding result set attributes to an object, but it is reflection based and, as the Spring developers note, can cause performance problems. Spring-JDBC-ROMA is not reflection based; it is a row mapper based on bytecode generation (with CGLib and Javassist). It generates the row mapper on the fly, as if it had been implemented by hand, so it has no performance overhead. It also supports lazy and eager object relations. There are many other interesting features, and they can be customized with the developer's own extended classes. It has some new and unique features such as conditional lazy, conditional lazy object loading, and conditional field ignoring. In addition, it has a custom expression language named RXEL (ROMA Expression Language).
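
    The contrast between a reflection-style mapper and a generated one can be sketched in Python. This is only an analogy, not Spring-JDBC-ROMA's mechanism (which generates JVM bytecode with CGLib and Javassist): the `Customer` class and both helper functions are hypothetical, and Python's `exec` stands in for bytecode generation. No performance claim is implied for the Python version.

```python
class Customer:
    """Hypothetical target class for mapped rows."""
    id: int
    name: str


def reflective_mapper(cls, columns):
    """Resolve each attribute by name on every row (reflection-style)."""
    def mapper(row):
        obj = cls.__new__(cls)
        for column in columns:
            setattr(obj, column, row[column])
        return obj
    return mapper


def generated_mapper(cls, columns):
    """Generate mapper source once and compile it into a plain function."""
    for column in columns:
        if not column.isidentifier():
            raise ValueError(f"unsafe column name: {column!r}")
    lines = ["def mapper(row):", "    obj = cls.__new__(cls)"]
    # One straight-line assignment per column, as a hand-written mapper would have.
    lines += [f"    obj.{column} = row[{column!r}]" for column in columns]
    lines.append("    return obj")
    namespace = {"cls": cls}
    exec("\n".join(lines), namespace)
    return namespace["mapper"]


row = {"id": 7, "name": "Ada"}
by_reflection = reflective_mapper(Customer, ["id", "name"])(row)
by_generation = generated_mapper(Customer, ["id", "name"])(row)
```

    Both mappers produce equivalent objects; the generated one simply moves the per-column name lookup from every row to a single generation step.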

  • T2 Big Data Hackathon

    - Present

    Turkey's first big data hackathon.

    Assemble a team. Learn & share. Make friends.
    Code Challenge and Project Challenge!
    Workshops & talks on Big Data and Data Science.
    Prizes, free goodies, free data science books.
    Free food & drinks, loads of snacks.

    Twitter hashtag #T2Hackathon

  • Jiagara

    High-Performance, Generic, Automated and Customizable Java Serialization/Deserialization Framework

    Jiagara's first simple benchmark results on 1,000 objects (32 bytes each) with primitive-typed properties:

    JVM Name: Java HotSpot(TM) 64-Bit Server VM
    JVM Version: 20.5-b03
    Java Version: 1.6.0_30
    Word Size: 8 byte
    Running 64-bit HotSpot VM.
    Using compressed references with 3-bit shift.
    Objects are 8 bytes aligned.

    Jiagara Serializer has been executed 10000 times in avg 42 milliseconds ...
    Kryo Serializer has been executed 10000 times in avg 103 milliseconds ...
    Avro Serializer has been executed 10000 times in avg 104 milliseconds ...
    Java Serializer has been executed 10000 times in avg 165 milliseconds ...
    Custom Serializer has been executed 10000 times in avg 140 milliseconds ...
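
    To make the comparison easier to read, the averages listed above can be turned into ratios relative to Jiagara. The numbers are copied from the listing; the `avg_ms` dictionary and variable names are just for this calculation, and since every figure uses the same unit, the ratios do not depend on how the "avg ... milliseconds" phrasing is interpreted.

```python
# Average times reported in the benchmark listing, in milliseconds.
avg_ms = {"Jiagara": 42, "Kryo": 103, "Avro": 104, "Java": 165, "Custom": 140}

baseline = avg_ms["Jiagara"]

# How many times longer each serializer took compared with Jiagara.
relative_time = {
    name: round(ms / baseline, 2)
    for name, ms in avg_ms.items()
    if name != "Jiagara"
}

# Serializers ordered from fastest to slowest.
ranking = sorted(avg_ms, key=avg_ms.get)
```

    In this single run, the other serializers took roughly 2.5 to 3.9 times as long as Jiagara.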

  • Leshy

    Leshy is a framework for replacing default Java serialization with your custom implementation on the fly without any code change in your application by using Java Instrumentation API.

  • SkyKeeper - Social Media Monitoring and Analyzing Platform

    SkyKeeper is a "Social Media Monitoring and Analyzing Platform" based on Cloud Computing.

    * Supports different kinds of social media platforms such as Twitter, Facebook, Foursquare, etc.

    * Crawls social media data from different sources and archives it on Amazon S3.

    * For monitoring, it runs different NLP algorithms (supervised, unsupervised and hybrid) with Map-Reduce pattern on Amazon Elastic MapReduce by Hadoop framework.

    * Analytical queries can be executed and reported on archived social media data with Hive framework.

    * SkyKeeper also supports live social media monitoring and analyzing by using Storm framework.

  • Spring-JDBC-ROMA

    - Present

    Spring-JDBC-ROMA is a row mapper extension for the Spring-JDBC module. Spring already ships a row mapper, "org.springframework.jdbc.core.BeanPropertyRowMapper", for binding result set attributes to an object, but it is reflection based and, as the Spring developers note, can cause performance problems. Spring-JDBC-ROMA is not reflection based; it is a row mapper based on bytecode generation (with CGLib and Javassist). It generates the row mapper on the fly, as if it had been implemented by hand, so it has no performance overhead. It also supports lazy and eager object relations. There are many other interesting features, and they can be customized with the developer's own extended classes.

  • Jillegal

    - Present

    Jillegal is a library of little-known Java tricks. It shields the developer from the low-level details needed to implement those tricks.

    Features:

    * Instrumenting and redefining any Java class, interface, ... (even core Java classes) at runtime is supported through a developer-friendly API (with a Builder Pattern based design). You can add custom pre/post listeners to method and constructor invocations dynamically. It serves as a platform for developing your own AOP framework. It uses the Java Instrumentation API, but adding an extra VM argument (like "-javaagent:<jarpath>[=<options>]") is not required. JFree has an internal agent and can enable its agent dynamically at runtime.

    * Accessing and setting any value at any address (as a HEX address) in the application is supported.

    * Accessing the real memory address of any object is supported. So you can change any object in memory by getting its address and copying your custom object to that address.

    * A sequentially allocated object pool is supported. With this feature, all objects in the pool are laid out sequentially in memory, so sequential access to them is faster, because they are fetched into the CPU cache together, within the limits of the cache size.

  • Ground Control Station Mission Systems project for Turkish UAV programme

    - Present

    Worked as both lead software engineer and system/test engineer. This project has embedded real-time components and non-real-time components. The main technologies used for this project are safety-critical real-time programming on top of Green Hills' INTEGRITY RTOS, STANAG 4586 compliant UAV subsystem development, video streaming and processing, and IMA compliant avionics system development.

  • SkyKeeper

    -

    Social Media Monitoring and Analysis Platform

  • Turkish Telecom Black/Grey List Management System

    -

    List management of customers in risk groups according to their debts.
    The project's infrastructure is based on cutting-edge OSS such as Spring, Spring Security, JPA, Hibernate (with Hibernate Search and Validator), Apache CXF, and Vaadin.
    Handling millions of records streaming from different databases with the asynchronous job infrastructure of Quartz.
    Implementation of clustered high-throughput web services (6k req/min) with Apache CXF, empowered with WS-* standards.
    Similarity search with Hibernate Search, crawling millions of records within a second.
    Dynamic rule engine that can be modified at runtime without the need for deployment.
    Clustered caching with Hazelcast.

  • TTekir - Türk Telekom Kural ve İş Robotu (Türk Telekom Rule and Work Robot)

    -

    TTekir is a flexible and generic rule engine to be used in Turkish Telecom.

    Rules can be defined in Java and Groovy on the fly and connected to each other by defined conditions on workflow with a drag-drop supported web-based editor without the need of deployment.

    Existing rule and condition classes are redefined on the fly without the need for a restart, so TTekir has its own small, custom, and independent OSGi container.

    Pluggable data access module is supported.

    Rules, conditions, and workflows are cached in a distributed manner, so response time is about 1-2 milliseconds.

    Workflows are served as web services.

    All web service calls and workflow executions are logged hierarchically and can be queried quickly.

  • KamGURU - Kampanya Gurusu (Campaign Guru)

    -

    KamGURU is a flexible rule engine to be used for Campaign Eligibility in Turkish Telecom.

    * Flexible rules developed in Groovy without the need of deployment.

    * Easily implement new rules with parameters.

    * Analysis of the new rule and condition requirements.

    * Analysis and implementation of the new rules.

    * Unit and acceptance tests of the implemented rules.


Honors and Awards

  • 2024 OpenTelemetry Community Awards Winner

    OpenTelemetry

    https://opentelemetry.io/blog/2024/community-awards-winners

  • Graduate Education Scholarship

    -

    TÜBİTAK (The Scientific and Technological Research Council of Turkey)

Test Scores

  • ALES

    Score: 97 / 100

  • KPDS

    Score: 80 / 100

  • ÜDS

    Score: 83 / 100

Languages

  • English

    Professional working proficiency

  • Turkish

    Native or bilingual proficiency
