About
A Computer Engineering MSc on Big Data and Cloud Computing with Hadoop and Hive at METU…
Activity
-
AWS Lambda has long allocated more CPU during initialization. But since it is not officially documented, it is not well…
Shared by Serkan Özal
-
Good news! I have been working on a scalable low latency vector database for a while. In June, together with Kevin Hanson and Zoltán Baranyi, we…
Liked by Serkan Özal
-
I just published the 3rd part of the series "Spring Boot 3 application on AWS Lambda" which is "Spring Cloud Function and its AWS Lambda Adapter" in…
Liked by Serkan Özal
Experience
Education
Licenses and Certifications
-
AWS Certified DevOps Engineer - Professional
Amazon Web Services (AWS)
Credential ID: AWS-DOP-C01-QXFLXX56
-
AWS Certified Developer - Associate
Amazon Web Services (AWS)
Credential ID: AWS-DVA-C01-650991
Publications
-
Flying Server-less on the Cloud with AWS Lambda
Serverless Turkey Meetup
What is serverless, and what does AWS Lambda provide as a FaaS service?
-
MySafe
My Unsafe - Unsafe Interceptor, Native Memory Leak Tracker and Access Checker on the JVM
MySafe intercepts (instruments) sun.misc.Unsafe calls and keeps records of allocated memory. It can therefore report allocated memory information, detect invalid memory accesses, and find the origins of native memory leaks.
-
JVM Under the Hood
What is going on under the hood in JVM?
Agenda:
- JVM Concepts
- Memory Management
- GC
- Class Loading
- Execution Engine
- Multi-Threading
- JNI
-
Big Data on AWS
Big data services on AWS:
- Storage (S3, Glacier)
- Analytics & Querying (DynamoDB, Redshift, RDS, Elasticsearch, CloudSearch, QuickSight)
- Processing (EMR, Kinesis, Lambda, Machine Learning)
- Flow (Firehose, Data Pipeline, DMS, Snowball)
Demo: https://github.com/serkan-ozal/ankaracloudmeetup-bigdata-demo
-
Ankara JUG Big Data Presentation
Ankara JUG
Presentation about Big Data Concepts, Map-Reduce and Hadoop for Ankara JUG December 2014 Meeting
-
Improving the performance of Hadoop Hive by sharing scan and computation tasks
Journal of Cloud Computing
MapReduce is a popular programming model for executing time-consuming analytical queries as a batch of tasks on large scale data clusters. In environments where multiple queries with similar selection predicates, common tables, and join tasks arrive simultaneously, many opportunities can arise for sharing scan and/or join computation tasks. Executing common tasks only once can remarkably reduce the total execution time of a batch of queries. In this study, we propose a Multiple Query Optimization framework, SharedHive, to improve the overall performance of Hadoop Hive, an open source SQL-based data warehouse using MapReduce. SharedHive transforms a set of correlated HiveQL queries into a new set of insert queries that will produce all of the required outputs within a shorter execution time. It is experimentally shown that SharedHive achieves significant reductions in total execution times of TPC-H queries.
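The scan-sharing idea in this abstract can be illustrated with a minimal Java sketch. It is not SharedHive's implementation (SharedHive rewrites HiveQL into insert queries); it only shows, on assumed toy data, how a single pass over a table can feed several queries at once. The `Order` class, the query names, and `sharedScan` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of scan sharing: one pass over a table serves many queries. */
class SharedScanSketch {

    /** Toy row type; the fields are illustrative only. */
    static final class Order {
        final int id;
        final String status;
        final double price;

        Order(int id, String status, double price) {
            this.id = id;
            this.status = status;
            this.price = price;
        }
    }

    /**
     * Scans the table once and routes each row to every query whose
     * selection predicate it satisfies, producing one output per query.
     */
    static Map<String, List<Order>> sharedScan(List<Order> table,
                                               Map<String, Predicate<Order>> queries) {
        Map<String, List<Order>> results = new LinkedHashMap<>();
        for (String name : queries.keySet()) {
            results.put(name, new ArrayList<>());
        }
        for (Order row : table) { // the single, shared scan
            for (Map.Entry<String, Predicate<Order>> query : queries.entrySet()) {
                if (query.getValue().test(row)) {
                    results.get(query.getKey()).add(row);
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Order> table = Arrays.asList(
                new Order(1, "OPEN", 50.0),
                new Order(2, "SHIPPED", 150.0),
                new Order(3, "OPEN", 300.0));
        Map<String, Predicate<Order>> queries = new LinkedHashMap<>();
        queries.put("open", o -> o.status.equals("OPEN"));
        queries.put("expensive", o -> o.price > 100.0);
        for (Map.Entry<String, List<Order>> e : sharedScan(table, queries).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " rows");
        }
    }
}
```

Running the queries separately would read the table once per query; here the read cost is paid once, which is the saving the abstract describes for correlated queries.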
-
AWS EMR - Amazon Elastic Map Reduce
Contents
* What is Amazon EMR
* Main Components of Amazon EMR
* Amazon EMR Features
* Ways to Use Amazon EMR
* Instance Types
* Cluster Types
* Amazon EMR Node Types
* Amazon EMR Architecture
* Using Amazon EMR via the Web Console
-
Dangerous Code: How to be Unsafe with Java Classes & Objects in Memory
Rebellabs / ZeroTurnaround
Let’s get laid out…the class and object structure of Java in memory:
* How much space does a class take up in memory?
* How much space do my objects consume in machine memory?
* What’s the deal with the alignment of object properties in memory?
-
Big Data Concepts
Contents:
1. Big Data and Scalability
2. NoSQL
2.1. Column Stores
2.2. Key-Value Stores
2.3. Document Stores
2.4. Graph Database Systems
3. Batch Data Processing
3.1. MapReduce
3.2. Hadoop
3.3. Running Analytical Queries over Offline Big Data
3.3.1. Hive
3.3.2. Pig
4. RealTime Data Processing
4.1. Storm
-
Sharing Scan and Computation Tasks in Hadoop Hive
Our framework, Shared-Hive, transforms a batch of queries into a new batch that will be executed more efficiently, by merging jobs into groups and evaluating each group as a single query. Based on our cost model, we define an optimization problem and we provide a solution that derives the optimal grouping of queries. Experiments in our prototype, built on top of Hadoop, demonstrate the overall effectiveness of our approach and substantial savings.
Patents
Projects
-
Sirocco
- Present
Sirocco, named after a character from Aladdin (http://aladdin.wikia.com/wiki/Sirocco), is a Java-based Lambda infrastructure developed by OpsGenie as an in-house framework. It has many unique features for the AWS Lambda platform, such as distributed/embedded monitoring (audit + stat + log), instrumentation, profiling, control requests, warmup, discovery, error handling/retrying over DLQ, etc.
This repository includes the open-sourced modules of Sirocco. However, some parts are not open-sourced and are used internally by OpsGenie's AWS Lambda based serverless infrastructure and products.
-
Thundra
- Present
Thundra, named after a character from Aladdin (http://aladdin.wikia.com/wiki/Thundra), is a JVM-based monitoring framework with audit (including instrumentation and profiling extensions), stat, and log support.
Thundra provides the following monitoring infrastructures:
- Audit: Auditing is used for tracing executions (method calls), collecting metrics, and publishing the audit data to be analyzed later. An audit starts at a point in time, collects audit metrics, and then finishes. It is generally used for tracing the execution during request/response based communication.
- Stat: The stat infrastructure collects stats (cumulative or instantaneous) and publishes the stats data to be analyzed later. Stats can be application/environment specific (CPU stats, memory stats, etc.), module/layer specific (cache stats, DynamoDB stats, etc.), or domain specific (user stats, etc.).
- Log: The logging infrastructure provides a log4j logger (org.apache.log4j.Logger) that decorates logs with application/environment information (application name, type, id, version, profile, host name, host IP, etc.) and with provided domain-specific custom log properties (customer name, user name, etc.).
-
DynaCast
DynaCast caches AWS DynamoDB data with Hazelcast and keeps the two eventually consistent. It is a very simple caching library based on Hazelcast ("Cast" comes from here) on top of AWS DynamoDB ("Dyna" comes from here) with very basic caching functionality (get, put, replace, remove), usable as a distributed or tiered (local + distributed) cache.
DynaCast caches data in memory, distributed via Hazelcast, and persists data into AWS DynamoDB. Under the hood, cache data in Hazelcast is kept eventually consistent with AWS DynamoDB by receiving mutation events (ordered by the shard/partition) from AWS DynamoDB Streams.
-
Samba
Samba is a very simple caching library with very basic caching functionality (get, put, replace, remove), usable as a local, global, or tiered (local + global) cache.
Samba is designed from scratch for non-blocking cache access with lock-free algorithms. High performance is therefore one of its major requirements. Keeping its strong/eventual consistency model promise is another.
Even though Samba can be useful in many cases as a simple caching layer, it is initially aimed at AWS's Lambda service, for sharing state/information between different Lambda function invocations, whether on the same container (process) or on another container (process/machine).
-
MySafe
MySafe is a framework (based on Jillegal-Agent) for managing memory accesses over sun.misc.Unsafe. MySafe intercepts (instruments) `sun.misc.Unsafe` calls and keeps records of allocated memory. It can therefore report allocated memory information and detect invalid memory accesses.
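The bookkeeping described above can be sketched in plain Java. This is not MySafe's API: the class and method names below are hypothetical, and the sketch assumes that an interceptor calls `onAllocate` and `onFree` around the real allocate/free calls. It only shows how recorded allocations allow bounds checks and leak reporting.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of allocation bookkeeping; not MySafe's actual API. */
class AllocationTrackerSketch {

    // Start address -> size in bytes of each live allocation, ordered by address.
    private final TreeMap<Long, Long> live = new TreeMap<>();

    /** Assumed to be called by an interceptor after an allocation returns. */
    synchronized void onAllocate(long address, long size) {
        live.put(address, size);
    }

    /** Assumed to be called before a free; false means the address was never recorded. */
    synchronized boolean onFree(long address) {
        return live.remove(address) != null;
    }

    /** True if [address, address + length) lies entirely inside one live allocation. */
    synchronized boolean isValidAccess(long address, long length) {
        Map.Entry<Long, Long> block = live.floorEntry(address);
        if (block == null) {
            return false; // no allocation starts at or before this address
        }
        long blockEnd = block.getKey() + block.getValue();
        return address + length <= blockEnd;
    }

    /** Bytes still allocated; whatever remains at shutdown is a leak candidate. */
    synchronized long liveBytes() {
        long total = 0;
        for (long size : live.values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        AllocationTrackerSketch tracker = new AllocationTrackerSketch();
        tracker.onAllocate(1000L, 64L);
        System.out.println(tracker.isValidAccess(1008L, 8L)); // prints true
        System.out.println(tracker.isValidAccess(1060L, 8L)); // prints false (runs past the block)
        tracker.onFree(1000L);
        System.out.println(tracker.liveBytes()); // prints 0
    }
}
```

An ordered map keeps the lookup for "which allocation contains this address" logarithmic, which matters when every memory access is checked.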
-
HermGen
Hazelcast Based Distributed ClassLoader and PermGen
Demo Application: https://github.com/serkan-ozal/hermgen-demo
-
Jemstone
Hidden gems of Java/JVM.
Jemstone is a platform for running HotSpot Serviceability Agent API based implementations on the current application (JVM process) or on another application (another JVM process).
-
Hazelcast-Aware
Hazelcast-Aware is a Java Instrumentation API based Hazelcast extension for using Hazelcast data structures (distributed maps, lists, sets, queues, objects, locks, topics, executors, entry listeners, etc.) without interacting with the HazelcastInstance class. You can specify which classes or fields will be Hazelcast-aware through annotation or XML based configuration. Hazelcast-Aware scans the classpath (classpath directories, dependent jar files, and web application directories) of your application, finds Hazelcast-aware classes and fields, and then instruments them. A demo application is available at https://github.com/serkan-ozal/hazelcast-aware-demo.
-
Spring-JDBC-ROMA 2.0
Spring-JDBC-ROMA is a row mapper extension for the Spring-JDBC module. Spring already ships a row mapper named "org.springframework.jdbc.core.BeanPropertyRowMapper" for binding result set attributes to an object, but it is reflection based and, as the Spring developers note, can cause performance problems. Spring-JDBC-ROMA is not reflection based; it is a byte code generation based row mapper (using CGLib and Javassist). It generates the row mapper on the fly, as if it had been implemented by hand, so it avoids the reflection overhead. It also supports lazy and eager object relations. There are many other interesting features, which can be customized with the developer's extended classes. It has some new and unique features such as conditional lazy, conditional lazy object loading and conditional field ignoring. In addition, it has a custom expression language named RXEL (ROMA Expression Language).
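The contrast between reflective mapping and a mapper "implemented by hand" can be sketched without Spring. The code below does not use Spring-JDBC-ROMA or perform byte code generation; `SimpleRowMapper`, `User`, and the use of a `Map` in place of a JDBC result set are assumptions made to keep the example self-contained. It shows the kind of direct assignments a generated mapper is equivalent to, next to a reflective version.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: reflective versus direct row mapping. */
class RowMappingSketch {

    /** Illustrative bean; not part of any library. */
    static class User {
        long id;
        String name;
    }

    /** Stand-in for a row mapper; a Map plays the role of the result set row. */
    interface SimpleRowMapper<T> {
        T map(Map<String, Object> row);
    }

    /** Reflection-based mapping: resolves every field by name for every row. */
    static final SimpleRowMapper<User> REFLECTIVE = row -> {
        User user = new User();
        try {
            for (Map.Entry<String, Object> column : row.entrySet()) {
                Field field = User.class.getDeclaredField(column.getKey());
                field.setAccessible(true);
                field.set(user, column.getValue());
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot map row", e);
        }
        return user;
    };

    /** Direct mapping: the plain assignments a hand-written mapper would contain. */
    static final SimpleRowMapper<User> DIRECT = row -> {
        User user = new User();
        user.id = (Long) row.get("id");
        user.name = (String) row.get("name");
        return user;
    };

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", 7L);
        row.put("name", "ada");
        User a = REFLECTIVE.map(row);
        User b = DIRECT.map(row);
        System.out.println(a.id == b.id && a.name.equals(b.name)); // prints true
    }
}
```

Both mappers produce the same object; the direct one skips the per-row field lookups, which is the cost a generated mapper is meant to avoid.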
-
T2 Big Data Hackathon
- Present
Turkey's first big data hackathon.
Assemble a team. Learn & Share. Make friends.
Code Challenge and Project Challenge!
Workshops & talks on Big Data and Data Science.
Prizes, free goodies, free data science books.
Free food & drinks, loads of snacks.
Twitter hashtag #T2Hackathon
-
Jiagara
High-Performance, Generic, Automated and Customizable Java Serialization/Deserialization Framework
Jiagara first simple benchmark results on 1000 objects (32 bytes each) with primitive-typed properties:
JVM Name: Java HotSpot(TM) 64-Bit Server VM
JVM Version: 20.5-b03
Java Version: 1.6.0_30
Word Size: 8 byte
Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Jiagara Serializer has been executed 10000 times in avg 42 milliseconds ...
Kryo Serializer has been executed 10000 times in avg 103 milliseconds ...
Avro Serializer has been executed 10000 times in avg 104 milliseconds ...
Java Serializer has been executed 10000 times in avg 165 milliseconds ...
Custom Serializer has been executed 10000 times in avg 140 milliseconds ...
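To put the averages above in relative terms, the following sketch divides each serializer's average time by Jiagara's 42 ms. The class and method names are hypothetical; the numbers are the ones listed above, and the ratios describe only this single benchmark configuration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that compares the averages reported above. */
class SerializerSpeedupSketch {

    /** How many times longer another serializer took than the baseline. */
    static double slowdownVersus(long baselineMillis, long otherMillis) {
        return (double) otherMillis / baselineMillis;
    }

    public static void main(String[] args) {
        long jiagaraMillis = 42; // average reported for Jiagara
        Map<String, Long> others = new LinkedHashMap<>();
        others.put("Kryo", 103L);
        others.put("Avro", 104L);
        others.put("Java", 165L);
        others.put("Custom", 140L);
        for (Map.Entry<String, Long> e : others.entrySet()) {
            System.out.println(String.format(Locale.ROOT, "%s: %.2fx the Jiagara time",
                    e.getKey(), slowdownVersus(jiagaraMillis, e.getValue())));
        }
    }
}
```

By these figures Kryo and Avro took roughly 2.5 times as long as Jiagara, the custom serializer about 3.3 times, and default Java serialization about 3.9 times.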
-
Leshy
Leshy is a framework that uses the Java Instrumentation API to replace default Java serialization with your custom implementation on the fly, without any code change in your application.
-
SkyKeeper - Social Media Monitoring and Analyzing Platform
SkyKeeper is a cloud-based "Social Media Monitoring and Analyzing Platform".
* Supports different kinds of social media platforms such as Twitter, Facebook, FourSquare, etc.
* Crawls social media data from different sources and archives it on Amazon S3.
* For monitoring, it runs different NLP algorithms (supervised, unsupervised and hybrid) with the MapReduce pattern on Amazon Elastic MapReduce using the Hadoop framework.
* Analytical queries can be executed and reported on archived social media data with the Hive framework.
* SkyKeeper also supports live social media monitoring and analysis using the Storm framework.
-
Spring-JDBC-ROMA
- Present
Spring-JDBC-ROMA is a row mapper extension for the Spring-JDBC module. Spring already ships a row mapper named "org.springframework.jdbc.core.BeanPropertyRowMapper" for binding result set attributes to an object, but it is reflection based and, as the Spring developers note, can cause performance problems. Spring-JDBC-ROMA is not reflection based; it is a byte code generation based row mapper (using CGLib and Javassist). It generates the row mapper on the fly, as if it had been implemented by hand, so it avoids the reflection overhead. It also supports lazy and eager object relations. There are many other interesting features, which can be customized with the developer's extended classes.
-
Jillegal
- Present
Jillegal is a library of little-known Java tricks. It shields the developer from the low-level details of implementing those tricks.
Features:
* Instrumenting and redefining any Java class, interface, ... (even core Java classes) at runtime is supported through a developer-friendly API (with a Builder Pattern based design). You can dynamically add your custom pre/post listeners to method and constructor invocations. It serves as a platform for developing your own AOP framework. It uses the Java Instrumentation API, but adding an extra VM argument (like "-javaagent:<jarpath>[=<options>]") is not required. JFree has an internal agent and can enable its agent dynamically at runtime.
* Accessing and setting any value at any address (as a HEX address) in the application is supported.
* Accessing the real memory address of any object is supported, so you can change any object in memory by getting its address and copying your custom object to that address.
* A sequentially allocated object pool is supported. With this feature, all objects in the pool are laid out sequentially in memory, so sequential access to them is faster because neighboring objects are fetched into the CPU cache together, within the limits of the cache size.
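The sequential layout idea in the last feature can be approximated in standard Java. This is not Jillegal's API, and it does not pool real Java objects: as a stated substitution, the sketch packs fixed-size records back to back in a direct `ByteBuffer`, so that neighboring records share cache lines. The class name and the two-int record layout are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: fixed-size (x, y) records stored contiguously in one buffer. */
class SequentialPointPool {

    private static final int RECORD_BYTES = 2 * Integer.BYTES; // x followed by y

    private final ByteBuffer storage;
    private final int capacity;

    SequentialPointPool(int capacity) {
        this.capacity = capacity;
        // One contiguous off-heap block holds every record.
        this.storage = ByteBuffer.allocateDirect(capacity * RECORD_BYTES);
    }

    void set(int index, int x, int y) {
        int base = offsetOf(index);
        storage.putInt(base, x);
        storage.putInt(base + Integer.BYTES, y);
    }

    int x(int index) {
        return storage.getInt(offsetOf(index));
    }

    int y(int index) {
        return storage.getInt(offsetOf(index) + Integer.BYTES);
    }

    /** Walks the records in address order, the access pattern the layout favors. */
    long sumX() {
        long total = 0;
        for (int i = 0; i < capacity; i++) {
            total += x(i);
        }
        return total;
    }

    private int offsetOf(int index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return index * RECORD_BYTES;
    }

    public static void main(String[] args) {
        SequentialPointPool pool = new SequentialPointPool(4);
        for (int i = 0; i < 4; i++) {
            pool.set(i, i + 1, -i);
        }
        System.out.println(pool.sumX()); // prints 10
    }
}
```

Ordinary heap objects can end up scattered across memory; placing records at fixed offsets in one block is what makes a linear walk cache-friendly.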
-
Ground Control Station Mission Systems project for Turkish UAV programme
- Present
Worked as both lead software engineer and system/test engineer. This project has embedded real-time components and non-real-time components. The main technologies used are safety-critical real-time programming on top of the Green Hills INTEGRITY RTOS, STANAG 4586 compliant UAV subsystem development, video streaming and processing, and IMA compliant avionics system development.
-
Turkish Telecom Black/Grey List Management System
List management of customers in risk groups according to their debts.
The project infrastructure is based on cutting-edge OSS such as Spring, Spring Security, JPA, Hibernate (with Hibernate Search and Hibernate Validator), Apache CXF and Vaadin.
Handling millions of records streaming from different databases with the asynchronous job infrastructure of Quartz.
Implementation of clustered high-throughput web services (6k req/min) with Apache CXF, empowered with WS-* standards.
Similarity search with Hibernate Search, crawling millions of records within a second.
Dynamic rule engine that can be modified at runtime without the need for deployment.
Clustered caching with Hazelcast.
-
TTekir - Türk Telekom Kural ve İş Robotu (Türk Telekom Rule and Job Robot)
TTekir is a flexible and generic rule engine for use at Turkish Telecom.
Rules can be defined in Java and Groovy on the fly and connected to each other through conditions defined on a workflow, using a web-based editor with drag-and-drop support, without the need for deployment.
Existing rule and condition classes are redefined on the fly without the need for a restart; for this purpose TTekir has its own small, custom, independent OSGi container.
A pluggable data access module is supported.
Rules, conditions and workflows are cached in a distributed cache, so response time is about 1-2 milliseconds.
Workflows are served as web services.
All web service calls and workflow executions are logged hierarchically and can be queried quickly.
-
KamGURU - Kampanya Gurusu (Campaign Guru)
KamGURU is a flexible rule engine used for campaign eligibility at Turkish Telecom.
* Flexible rules developed in Groovy without the need for deployment.
* Easily implement new rules with parameters.
* Analysis of new rule and condition requirements.
* Analysis and implementation of new rules.
* Unit and acceptance tests of the implemented rules.
Honors and Awards
-
2024 OpenTelemetry Community Awards Winner
OpenTelemetry
https://opentelemetry.io/blog/2024/community-awards-winners
-
Graduate Education Scholarship
TÜBİTAK (The Scientific and Technological Research Council of Turkey)
Test Scores
-
ALES
Score: 97 / 100
-
KPDS
Score: 80 / 100
-
ÜDS
Score: 83 / 100
Languages
-
English
Professional working proficiency
-
Turkish
Native or bilingual proficiency